ArrayFormula of Average on Infinite Truly Dynamic Range in Google Sheets
As in this example:
A B C D E F G ∞
|======|=======|=====|=====|=====|=====|=====|=====
1 | |AVERAGE| | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
2 | xx 1 | | 1 | 2 | 0.5 | 10 | |
|======|=======|=====|=====|=====|=====|=====|=====
3 | xx 2 | | 7 | 1 | | | |
|======|=======|=====|=====|=====|=====|=====|=====
4 | | | 0 | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
5 | xx 3 | | 9 | 8 | 7 | 6 | |
|======|=======|=====|=====|=====|=====|=====|=====
6 | xx 4 | | 0 | 1 | 2 | 1 | |
|======|=======|=====|=====|=====|=====|=====|=====
7 | | | 1 | | 4 | | |
|======|=======|=====|=====|=====|=====|=====|=====
8 | xx 5 | | | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
9 | | | | | | | 5 |
|======|=======|=====|=====|=====|=====|=====|=====
∞ | | | | | | | |
What is the most optimal way to get the AVERAGE of each valid row in a truly dynamic sense (unknown number of rows and unknown number of columns)?
Answers
QUERY
level 1:
if all 5 cells in range C2:G have values:
=QUERY(QUERY(C2:G, "select (C+D+E+F+G)/5"), "offset 1", )
if not, the row is skipped.
if empty cells are considered as zeros:
=INDEX(QUERY(QUERY({C2:G*1}, "select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))
to remove zero values we use the IFERROR(1/(1/...)) wrapping:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))))
to make the Col references dynamic, we can do:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select "&
"("&JOIN("+", "Col"&ROW(INDIRECT("1:"&COLUMNS(C:G))))&")/"&COLUMNS(C:G)),
"offset 1", ))))
level 2:
if empty cells are not considered as zeros and shouldn't be skipped:
=INDEX(TRANSPOSE(QUERY(TRANSPOSE(E2:I),
"select "&TEXTJOIN(",", 1, IF(A2:A="",,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")")))),, 2)
note that this is column A dependent, so missing values in column A will offset the results
fun fact!! we can swap avg for max or min.
to free it from the confinement of column A and make it work for any valid row:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(C2:G),,9^9)))="", C2:G*0, C2:G)),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
if the 0s present in the range shouldn't be averaged, we can add a small IF statement:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
IF(C2:G>0, C2:G, )),,9^9)))="", C2:G*0,
IF(C2:G>0, C2:G, ))),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
here we used the so-called "vertical query smash", which takes all values in a given range and concentrates them into one single column, where all cells of each row are joined with an empty space as a byproduct:
=FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9))
apart from this, there is also the "horizontal query smash":
=QUERY(C2:G,,9^9)
and also the "ultimate 360° double query smash", which puts all cells from the range into one single cell:
=QUERY(FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9)),,9^9)
and finally "the infamous negative 360° reverse double query smash", which prioritizes columns over rows:
=QUERY(FLATTEN(QUERY(C2:G,,9^9)),,9^9)
all query smash names are copyrighted of course
back to the topic... as mentioned above, all cells per row in the range are joined with an empty space, even the empty ones, so we end up with double or multiple spaces between values. to fix this we use TRIM and introduce a simple IF statement to assign 0 values for empty rows in a given range, e.g. to counter the offset.
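For readers who want to see the idea outside of Sheets, here is a minimal Python sketch of what the vertical smash plus TRIM check does. It is only a model under stated assumptions: `vertical_smash` is a made-up helper name, `None` stands in for an empty cell, and Python's `strip()` only trims the ends of a string, whereas TRIM in Sheets also collapses repeated inner spaces.

```python
def vertical_smash(rows):
    """Join the cells of each row with single spaces; blanks become empty strings."""
    return [" ".join("" if cell is None else str(cell) for cell in row) for row in rows]


# Hypothetical sample data: None marks an empty cell.
data = [[1, None, 4], [None, None, None], [None, 5, None]]

smashed = vertical_smash(data)
# A row counts as empty when nothing but spaces is left after trimming.
is_empty_row = [text.strip() == "" for text in smashed]
```

The doubled spaces in the smashed strings are exactly the byproduct described above, and the trimmed comparison is what lets the formula tell an empty row from a row that merely starts or ends with blanks.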
MMULT
level 3:
MMULT
is a kind of heavy class formula that is able to perform addition, subtraction, multiplication, division even running total on arrays/matrixes... however, bigger the dataset = slower the formula calculation (because in MMULT
even empty rows take time to perform + - × ÷
operation) ...unless we use truly dynamic range infinite in both directions...
to get the last row with values of a given range:
=INDEX(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))))
to get the last column with values of a given range:
=INDEX(MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))
now we can construct it in a simple way:
=INDIRECT("C2:"&ADDRESS(9, 7))
which is the same as:
=INDEX(INDIRECT("C2:"&ADDRESS(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))),
MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))))
or a shorter alternative:
=INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2)))))
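The same "last used row / last used column" logic can be sketched in Python for illustration. This is a model under assumptions, not Sheets code: `last_used_cell` is a hypothetical helper, `None` marks an empty cell, and the grid is the sample from the question (starting at C2) padded with one extra empty row and one extra empty column.

```python
def last_used_cell(grid, first_row=2, first_col=3):
    """Return (row, column) numbers of the bottom-most and right-most non-empty cells."""
    last_row = last_col = 0
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell is not None:  # mirrors range<>"" (a 0 still counts as a value)
                last_row = max(last_row, first_row + r)
                last_col = max(last_col, first_col + c)
    return last_row, last_col


# Sample from the question, columns C..H, rows 2..10; None marks an empty cell.
grid = [
    [1, 2, 0.5, 10, None, None],         # row 2
    [7, 1, None, None, None, None],      # row 3
    [0, None, None, None, None, None],   # row 4
    [9, 8, 7, 6, None, None],            # row 5
    [0, 1, 2, 1, None, None],            # row 6
    [1, None, 4, None, None, None],      # row 7
    [None] * 6,                          # row 8
    [None, None, None, None, 5, None],   # row 9
    [None] * 6,                          # row 10 (padding)
]

corner = last_used_cell(grid)
```

For this sample the corner is row 9, column 7, which matches the ADDRESS(9, 7) used in the simple construction above.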
therefore the simplified MMULT formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9<>"", 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
in case we want to exclude zero values from the range, the formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9>0, 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
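To make the arithmetic behind the two MMULT formulas explicit, here is a small Python sketch. It is an illustration only: `mmult` and `row_averages` are hypothetical helper names, `None` stands for an empty cell, and a returned `None` plays the role of the blank that IFERROR produces on division by zero.

```python
def mmult(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_averages(grid, exclude_zeros=False):
    ones = [[1] for _ in grid[0]]  # column of 1s, like ROW(...)^0
    values = [[0 if c is None else c for c in row] for row in grid]  # like N(range)
    if exclude_zeros:
        flags = [[1 if c is not None and c > 0 else 0 for c in row] for row in grid]
    else:
        flags = [[0 if c is None else 1 for c in row] for row in grid]
    sums = mmult(values, ones)    # row sums
    counts = mmult(flags, ones)   # how many cells count towards each average
    return [s[0] / n[0] if n[0] else None for s, n in zip(sums, counts)]


# Sample from the question, C2:G9; None marks an empty cell.
grid = [
    [1, 2, 0.5, 10, None],
    [7, 1, None, None, None],
    [0, None, None, None, None],
    [9, 8, 7, 6, None],
    [0, 1, 2, 1, None],
    [1, None, 4, None, None],
    [None, None, None, None, None],
    [None, None, None, None, 5],
]

averages = row_averages(grid)
averages_without_zeros = row_averages(grid, exclude_zeros=True)
```

Multiplying by a column of 1s is just a row-wise sum, so the first product gives the totals and the second gives the divisor. Note that the `> 0` test, like the one in the formula, would also leave negative numbers out of the count.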
level 4:
putting together all of the above to make it infinitely dynamic and still restricted to the valid dataset:
=INDEX(IFERROR(
MMULT(N( INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)/
MMULT(N(IF(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))<>"", 1, )), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)))
again, to exclude cells with zeros in the range, replace <>"" with >0 in the second MMULT, as in level 3.
honorable mentions:
@Erik Tyler level:
the polar opposite of the previous formula would be to run the MMULT on
- the total area of C2:? (all rows, all columns)
instead of
- the valid area of C2:? (excluding empty rows and columns), which avoids mass-calculations of 0 × 0 = 0
including zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"", 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
excluding zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))>0, 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
@kishkin level:
for a fixed range C2:G9 the MMULT average would be:
=INDEX(IFERROR(
MMULT( C2:G9*1, FLATTEN(COLUMN(C:G))^0)/
MMULT((C2:G9>0)*1, FLATTEN(COLUMN(C:G))^0)))
=INDEX(IFNA(VLOOKUP(ROW(C2:C),
QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&C2:J), "×"),
"select Col1,avg(Col2)
where Col2 is not null
group by Col1"), 2, )))
@MattKing level:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))
excluding zeros:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
where Col2 <> 0
group by Col1
label avg(Col2)''"))
including empty cells:
=INDEX(IFERROR(1/(1/QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)*1), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))))
You put a ton of time into this. I hope people appreciate it, more so that you did it for everyone else and not for yourself.
Looking at your final formulas, these should produce the same results (given data in C2:? as in your examples):
In B2 (include zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"",1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))
In B2 (exclude zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>0,1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))
UPDATE: I've updated the formula from my original post. The ROW() should always come first so that missing values in the data don't throw off the split.
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"|"&OFFSET(C2,,,9^9,9^9)),"|"),"select AVG(Col2) group by Col1 label AVG(Col2)''"))
Should work unless I'm misunderstanding the question.
No need for vlookups or mmults or filters or anything.
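As a rough illustration of why pairing every cell with its row number is enough, here is a Python sketch of the flatten, split and group-by steps. The names are hypothetical, `None` marks an empty cell, and the sketch assumes that an all-blank group produces a blank average, which is what keeps the output aligned row by row.

```python
def grouped_row_averages(grid, first_row=2):
    # Like FLATTEN(ROW(...) & "|" & cell) followed by SPLIT: one (row, value) pair per cell.
    pairs = [(first_row + r, cell) for r, row in enumerate(grid) for cell in row]

    # Like "group by Col1": every row number gets a group, even if all its values are blank.
    groups = {}
    for row_number, value in pairs:
        groups.setdefault(row_number, [])
        if value is not None:
            groups[row_number].append(value)

    # Like "avg(Col2)": blanks are ignored; a group without values stays blank (None).
    return [sum(v) / len(v) if v else None for _, v in sorted(groups.items())]


# Sample from the question, C2:G9; None marks an empty cell.
grid = [
    [1, 2, 0.5, 10, None],
    [7, 1, None, None, None],
    [0, None, None, None, None],
    [9, 8, 7, 6, None],
    [0, 1, 2, 1, None],
    [1, None, 4, None, None],
    [None, None, None, None, None],
    [None, None, None, None, 5],
]

averages = grouped_row_averages(grid)
```

Because the row number is always the first half of each pair, a blank value can never shift the key into the wrong position, which is the point of putting ROW() first.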
I will try to make a little addition to @player0's answer. And I will really appreciate any comments on optimizing this.
In case there are a lot of empty rows and columns inside the data range, those might as well be excluded from MMULT.
Step 1 - Filter out empty rows
We've got a data range: from C2 down to the last row and right to the last column (which is J:J). I will use C2:K, see details below for explanation.
This formula will give us an array of row numbers where there is at least one non-empty cell. It will also contain a 0 if there are empty rows, but that won't matter when searching in this array, and we will filter it out where it does matter:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K)))
)
So, to filter out empty rows from the data range we use FILTER, which checks whether a row is in our array from above and keeps it in that case:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
)
)
Step 2 - Filter out empty columns
To get an array of only non-empty column numbers we can use almost the same formula:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2))))
)
Why SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)) is used instead of COLUMN(C2:K): see details at the end.
To filter out empty columns we also use FILTER with a MATCH condition to search for column numbers in our array:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
And to filter out empty rows and empty columns we just use two FILTERs:
=ARRAYFORMULA(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
Original data range will internally become:
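A small Python model of the two filters gives an idea of the compacted result. It is a sketch under assumptions: the helper name is made up, `None` marks an empty cell, and the input is the sample from the question's table (padded with one empty column), not the sheet used in this answer.

```python
def drop_empty_rows_and_columns(grid, first_row=2, first_col=3):
    """Keep only rows and columns that contain at least one non-empty cell."""
    kept_rows = [r for r, row in enumerate(grid) if any(c is not None for c in row)]
    kept_cols = [c for c in range(len(grid[0])) if any(row[c] is not None for row in grid)]
    # Blanks inside the kept area become 0, like C2:K*1 does.
    compact = [[0 if grid[r][c] is None else grid[r][c] for c in kept_cols] for r in kept_rows]
    row_numbers = [first_row + r for r in kept_rows]
    col_numbers = [first_col + c for c in kept_cols]
    return row_numbers, col_numbers, compact


# Sample from the question, columns C..H, rows 2..9; None marks an empty cell.
grid = [
    [1, 2, 0.5, 10, None, None],
    [7, 1, None, None, None, None],
    [0, None, None, None, None, None],
    [9, 8, 7, 6, None, None],
    [0, 1, 2, 1, None, None],
    [1, None, 4, None, None, None],
    [None, None, None, None, None, None],
    [None, None, None, None, 5, None],
]

row_numbers, col_numbers, compact = drop_empty_rows_and_columns(grid)
```

In this sample row 8 and column H disappear, so the matrix handed to MMULT shrinks from 8 × 6 to 7 × 5 while the surviving row numbers are kept for later realignment.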
Step 3 - Do the MMULT
Now we can use MMULT with that data set to calculate the average:
=ARRAYFORMULA(
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
)
The result is a bit off relative to the original data rows, because the empty rows have been filtered out.
Step 4 - Fill the AVERAGE column
To make the averages consistent with the original data rows we can use VLOOKUP like this:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
...
) /
MMULT(
...
)
},
2,
0
))
)
Where
- SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)) is an array of row numbers from the 2nd one to the last non-empty one. We won't be filling all the rows down with empty strings.
- QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0") is an array of non-empty row numbers, with that 0 filtered out, used as keys for the search.
- IFNA will return an empty string to put alongside an empty data row.
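The realignment step can be pictured with a few lines of Python. This is only a sketch: `realign` is a hypothetical name, and the keys and averages below are the values one would get for the sample table in the question, used here as an assumed input.

```python
def realign(keys, averages, first_row, last_row):
    """Place each average next to its original row; rows without a key get ""."""
    lookup = dict(zip(keys, averages))  # plays the role of the {keys, averages} array
    # range(first_row, last_row + 1) mirrors SEQUENCE(last_row - 1, 1, 2) when first_row is 2.
    return [lookup.get(row, "") for row in range(first_row, last_row + 1)]


# Assumed input: non-empty row numbers and their averages for the question's sample.
keys = [2, 3, 4, 5, 6, 7, 9]
averages = [3.375, 4.0, 0.0, 7.5, 1.0, 2.5, 5.0]

column_b = realign(keys, averages, first_row=2, last_row=9)
```

The empty string in the seventh position corresponds to row 8, which has no data, so the averages for rows 2 to 9 land next to the right rows again.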
FINAL FORMULA
Putting it all together:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
},
2,
0
))
)
Some details
- INDEX could be used instead of ARRAYFORMULA for brevity (thanks @player0, taught me that a few months ago), but I like the unambiguity of ARRAYFORMULA.
- I use SEQUENCE to construct a column or a row of 1s to be explicit, for clarity. For example, this one
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
could be replaced with
SIGN(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
)
which is a bit shorter. There is also the way shown here by @player0 of raising to the power of 0:
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)^0
but (this is only my speculation) I think SEQUENCE's internal implementation should be simpler than the operation of raising to a power.
- I use range C2:K, which is one column more than actually exists on the sheet. Not only does it give a range of all the columns to the right of C2 and all the rows down from it, but it also updates when another column is added to the right of the sheet: a demo. Though it does not get to be highlighted. This C2:K can almost perfectly (there will be a problem in case there is actually a ZZZ column present on a sheet) replace those approaches:
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
- There is a small drawback in using C2:K: =ARRAYFORMULA(COLUMN(C2:K)) will return an array of column numbers even for non-existing ones, so we need to use =SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)) instead.
I think there is a simple answer for row-wise average using VLOOKUP and QUERY.
This one is in B2:
=ARRAYFORMULA(
IFNA(
VLOOKUP(
ROW(B2:B),
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(C2:J)
},
"SELECT Col1, AVG(Col2)
WHERE Col2 IS NOT NULL
GROUP BY Col1"
),
2,
0
)
)
)
- This could be easily changed for max, min, sum, count - just change the aggregation function inside the QUERY statement.
- The same approach could be used for column-wise aggregation.
- FLATTEN(C2:J) could be changed to:
  - FLATTEN(--C2:J) to treat empty cells as 0s;
  - FLATTEN(IFERROR(1/(1/C2:J))) to exclude 0s from the average.
- If there are no intermediate empty rows, VLOOKUP could be removed from the formula, as well as Col1 from the SELECT statement.
- There is a shorter version (thanks @MattKing!) without VLOOKUP and WHERE Col...:
=ARRAYFORMULA(
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(IFERROR(1/(1/C2:J)))
},
"SELECT AVG(Col2)
GROUP BY Col1
LABEL AVG(Col2) ''"
)
)
I use the C2:J range, having columns up to I:I; some details on that:
- Range C2:J is one column more than actually exists on the sheet. Not only does it give a range of all the columns to the right of C2 and all the rows down from it, but it also updates when another column is added to the right of the sheet: a demo. Though it does not get to be highlighted. This C2:J can almost perfectly (there will be a problem in case there is actually a ZZZ column present on a sheet) replace those approaches:
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
- There is a small drawback in using C2:J: =ARRAYFORMULA(0 * COLUMN(C2:J)) will return an array of column numbers even for non-existing ones (multiplied by 0), so we need to use =SEQUENCE(1, COLUMNS(C2:J),,) instead.
@player0, any thoughts on this?