ArrayFormula of Average on Infinite Truly Dynamic Range di Google Sheets
seperti contoh:
A B C D E F G ∞
|======|=======|=====|=====|=====|=====|=====|=====
1 | |AVERAGE| | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
2 | xx 1 | | 1 | 2 | 0.5 | 10 | |
|======|=======|=====|=====|=====|=====|=====|=====
3 | xx 2 | | 7 | 1 | | | |
|======|=======|=====|=====|=====|=====|=====|=====
4 | | | 0 | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
5 | xx 3 | | 9 | 8 | 7 | 6 | |
|======|=======|=====|=====|=====|=====|=====|=====
6 | xx 4 | | 0 | 1 | 2 | 1 | |
|======|=======|=====|=====|=====|=====|=====|=====
7 | | | 1 | | 4 | | |
|======|=======|=====|=====|=====|=====|=====|=====
8 | xx 5 | | | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
9 | | | | | | | 5 |
|======|=======|=====|=====|=====|=====|=====|=====
∞ | | | | | | | |
Apa cara yang paling optimal untuk mendapatkan AVERAGEsetiap baris yang valid dalam pengertian istilah yang dinamis (jumlah baris yang tidak diketahui & jumlah kolom yang tidak diketahui)?
Jawaban
PERTANYAAN
tingkat 1:
Jika semua 5 sel dalam rentang C2: G memiliki nilai:
=QUERY(QUERY(C2:G, "select (C+D+E+F+G)/5"), "offset 1", )
jika tidak, baris akan dilewati:
jika sel kosong dianggap sebagai nol:
=INDEX(QUERY(QUERY({C2:G*1}, "select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))
untuk menghapus nilai nol kami menggunakan IFERROR(1/(1/...))pembungkus:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))))
untuk membuat Colreferensi dinamis, kita dapat melakukan:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select "&
"("&JOIN("+", "Col"&ROW(INDIRECT("1:"&COLUMNS(C:G))))&")/"&COLUMNS(C:G)),
"offset 1", ))))
level 2:
jika sel kosong tidak dianggap sebagai nol dan tidak boleh dilewati:
=INDEX(TRANSPOSE(QUERY(TRANSPOSE(E2:I),
"select "&TEXTJOIN(",", 1, IF(A2:A="",,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")")))),, 2)
perhatikan bahwa ini adalah kolom A dependen, jadi nilai yang hilang di kolom A akan mengimbangi hasil
fakta menyenangkan !! kami dapat menukar avgke maxatau min:
untuk membebaskannya dari kurungan kolom A dan membuatnya berfungsi untuk setiap baris yang valid:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(C2:G),,9^9)))="", C2:G*0, C2:G)),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
jika ada 0 dalam kisaran tidak boleh dirata-ratakan, kita dapat menambahkan pernyataan IF kecil:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
IF(C2:G>0, C2:G, )),,9^9)))="", C2:G*0,
IF(C2:G>0, C2:G, ))),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
di sini kami menggunakan apa yang disebut "penghancuran kueri vertikal" yang mengambil semua nilai dalam rentang tertentu dan memusatkannya ke satu kolom tunggal, di mana semua sel per setiap baris digabungkan dengan ruang kosong sebagai produk sampingan:
=FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9))
Selain itu, ada juga "penghancuran kueri horizontal" :
=QUERY(C2:G,,9^9)
dan juga "penghancuran kueri ganda 360 ° pamungkas" yang menempatkan semua sel dari rentang ke dalam satu sel:
=QUERY(FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9)),,9^9)
dan terakhir "penghancuran kueri ganda terbalik 360 ° negatif yang terkenal" yang memprioritaskan kolom di atas baris:
=QUERY(FLATTEN(QUERY(C2:G,,9^9)),,9^9)
semua nama permintaan smash tentu saja memiliki hak cipta
back to the topic... as mentioned above all cells per row in range are joined with empty space even those empty ones, so we got a situation where we getting double or multiple spaces between values. to fix this we use TRIM and introduce a simple IF statement to assign 0 values for empty rows in a given range eg. to counter the offset:
MMULT
level 3:
MMULT is a kind of heavy class formula that is able to perform addition, subtraction, multiplication, division even running total on arrays/matrixes... however, bigger the dataset = slower the formula calculation (because in MMULT even empty rows take time to perform + - × ÷ operation) ...unless we use truly dynamic range infinite in both directions...
to get the last row with values of a given range:
=INDEX(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))))
untuk mendapatkan kolom terakhir dengan nilai dari rentang tertentu:
=INDEX(MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))
sekarang kita dapat membangunnya dengan cara yang sederhana:
=INDIRECT("C2:"&ADDRESS(9, 7))
yang sama dengan:
=INDEX(INDIRECT("C2:"&ADDRESS(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))),
MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))))
atau alternatif yang lebih pendek:
=INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2)))))
oleh karena itu rumus MMULT yang disederhanakan akan menjadi:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9<>"", 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
jika kami ingin mengecualikan nilai nol dari rentang, rumusnya adalah:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9>0, 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
level 4:
menggabungkan semua hal di atas untuk membuatnya dinamis tanpa batas dan masih terbatas pada kumpulan data yang valid:
=INDEX(IFERROR(
MMULT(N( INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)/
MMULT(N(IF(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))<>"", 1, )), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)))
sekali lagi, tidak termasuk sel dengan nol dalam kisaran:
sebutan terhormat:
@Erik Tyler level:
kebalikan dari rumus sebelumnya akan menjalankan MMULTpada
- luas total, bukan
C2:?(all rows, all columns) - area valid yang menghindari penghitungan massa
C2:?(excluding empty rows and columns)0 × 0 = 0
termasuk nol:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"", 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
tidak termasuk nol:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))>0, 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
@kishkin level:
untuk rentang tetap rata C2:G9- MMULTrata adalah:
=INDEX(IFERROR(
MMULT( C2:G9*1, FLATTEN(COLUMN(C:G))^0)/
MMULT((C2:G9>0)*1, FLATTEN(COLUMN(C:G))^0)))
=INDEX(IFNA(VLOOKUP(ROW(C2:C),
QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&C2:J), "×"),
"select Col1,avg(Col2)
where Col2 is not null
group by Col1"), 2, )))
Tingkat @MattKing :
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))
tidak termasuk nol:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
where Col2 <> 0
group by Col1
label avg(Col2)''"))
including empty cells:
=INDEX(IFERROR(1/(1/QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)*1), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))))
You put a ton of time into this. I hope people appreciate it, more so that you did it for everyone else and not for yourself.
Looking at your final formulas, these should produce the same results (give data in C2:? as in your examples):
In B2 (include zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"",1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))
In B2 (exclude zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>0,1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))
UPDATE: I've updated the formula from my original post. The ROW() should always come first so that missing values in the data don't throw off the split.
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"|"&OFFSET(C2,,,9^9,9^9)),"|"),"select AVG(Col2) group by Col1 label AVG(Col2)''"))
Should work unless I'm misunderstanding the question.
No need for vlookups or mmults or filters or anything.
I will try to make a little addition to @player0's answer. And I will really appreciate any comments on optimizing this.
In case there is a lot of empty rows and columns inside the data range those might as well be excluded from MMULT.
Step 1 - Filter out empty rows
We've got a data range: from C2 down to the last row and right to the last column (which is J:J). I will use C2:K, see details below for explanation.
This formula will give us an array of row numbers where there is at least one non empty cell. Also it will have a 0 if there are empty rows, but it won't matter for searching in this array, or we will filter it out when it does matter:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K)))
)
So, to filter out empty rows from the data range we use FILTER which will check if a row is in our array from above and leave if be in that case:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
)
)
Step 2 - Filter out empty columns
To get an array of only non-empty column numbers we can use almost the same formula:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2))))
)
Why SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)) is used instead of COLUMN(C2:K) see details at the end.
To filter out empty columns we also use FILTER with MATCH condition to search for column numbers in our array:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
And to filter out empty rows and empty columns we just use two FILTERs:
=ARRAYFORMULA(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
Original data range will internally become:
Step 3 - Do the MMULT
Now we can use MMULT with that data set to calculate average:
=ARRAYFORMULA(
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
)
It is a bit off regarding original data rows.
Step 4 - Fill the AVERAGE column
To make averages consistent with the original data rows we can use VLOOKUP like this:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
...
) /
MMULT(
...
)
},
2,
0
))
)
Where
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2))is an array of row numbers from the 2nd one to the last none-empty one. We won't be filling all the rows down with empty strings.QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0")is an array of non-empty row numbers with that0filtered out used as keys for search.IFNAwill return an empty string to put alongside an empty data row.
FINAL FORMULA
Menyatukan semuanya:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
},
2,
0
))
)
Sedikit detail
INDEXdapat digunakan sebagai penggantiARRAYFORMULAsingkatnya (terima kasih @ player0, mengajari saya hal itu beberapa bulan yang lalu), tetapi saya suka ketidakjelasanARRAYFORMULA.- Saya gunakan
SEQUENCEuntuk membangun kolom atau baris1s agar eksplisit, untuk kejelasan. Misalnya yang ini
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
bisa diganti dengan
SIGN(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
)
yang sedikit lebih pendek. Ada juga cara yang ditunjukkan di sini oleh @ player0 untuk meningkatkan kekuatan 0:
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)^0
tetapi (ini hanya spekulasi saya) saya pikir SEQUENCEimplementasi internal harus lebih sederhana daripada operasi peningkatan kekuasaan.
- I use range
C2:Kwhich is one column more than there actually exist on the sheet. Not only it gives a range of all the columns to the right ofC2and all the rows down from it, but it also updates in case of adding another column to the right of the sheet: a demo. Though it does not get to be highlighted. ThisC2:Kcan almost perfectly (there will be a problem in case there is actuallyZZZcolumn present on a sheet) replace those approaches:
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
- There is a small drawback in using
C2:K:=ARRAYFORMULA(COLUMN(C2:K))will return an array of column numbers even for non-existing ones, so we need to use=SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2))instead.
I think there is a simple answer for row-wise average using VLOOKUP and QUERY.
This one is in B2:
=ARRAYFORMULA(
IFNA(
VLOOKUP(
ROW(B2:B),
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(C2:J)
},
"SELECT Col1, AVG(Col2)
WHERE Col2 IS NOT NULL
GROUP BY Col1"
),
2,
0
)
)
)
- Ini dapat dengan mudah diubah untuk max, min, sum, count - cukup ubah fungsi agregasi di dalam
QUERYpernyataan. - Pendekatan yang sama dapat digunakan untuk agregasi berdasarkan kolom.
FLATTEN(C2:J)bisa diubah menjadi:FLATTEN(--C2:J)untuk memperlakukan sel kosong sebagai0s;FLATTEN(IFERROR(1/(1/C2:J)))untuk mengecualikan0s dari rata-rata.
- Jika tidak ada baris kosong perantara,
VLOOKUPbisa dihapus dari rumus, sertaCol1dariSELECTpernyataan. - Ada versi yang lebih pendek (terima kasih @MattKing!) Tanpa
VLOOKUPdanWHERE Col...:
=ARRAYFORMULA(
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(IFERROR(1/(1/C2:J)))
},
"SELECT AVG(Col2)
GROUP BY Col1
LABEL AVG(Col2) ''"
)
)
Saya menggunakan C2:Jrentang yang memiliki kolom hingga I:I, beberapa detail tentang itu:
- Rentang
C2:Jyang satu kolom lebih banyak dari yang sebenarnya ada di lembar. Tidak hanya memberikan rentang semua kolom di sebelah kananC2dan semua baris di bawahnya, tetapi juga memperbarui jika menambahkan kolom lain di sebelah kanan sheet: demo . Padahal itu tidak bisa disorot. IniC2:Jhampir bisa sempurna (akan ada masalah jika sebenarnyaZZZada kolom yang ada di selembar kertas) menggantikan pendekatan tersebut:
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
- Ada sedikit kelemahan dalam menggunakan
C2:J:=ARRAYFORMULA(0 * COLUMN(C2:J))akan mengembalikan larik nomor kolom bahkan untuk yang tidak ada (dikalikan dengan0), jadi kita perlu menggunakan=SEQUENCE(1, COLUMNS(C2:J),,)sebagai gantinya.
@ player0, ada pemikiran tentang ini?