[7] Okapi BM25 0
1 [1] [2] [3] 2 3 4 5 6 7 8 1
2 2.1 Web Web [1] Web [A] [B]; A: B: [2] (CRF) [2] 7 6 [2] [3] [3] 7 2
2.2 2.2.1 [4] [7] 14,819 7,320 8 [5] [7] 1,615 1,030 100 8 9 [6] 2.2.2 [7] 6,000 14,700 [7] 2,830 36 30 3
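14,700

Okapi BM25

Okapi BM25 [8] assigns a document D a score for a query Q by summing a per-term score s(d, q) over the terms q in Q:

$$\mathrm{score}(d, Q) = \sum_{q \in Q} s(d, q) \tag{2.1}$$

$$s(d, q) = \mathrm{IDF}(q) \cdot \frac{f(q, D)\,(k + 1)}{f(q, D) + k\left(1 - b + b\,\dfrac{|D|}{\mathrm{avgdl}}\right)} \tag{2.2}$$

$$\mathrm{IDF}(q) = \log \frac{N - n(q) + 0.5}{n(q) + 0.5} \tag{2.3}$$

Here f(q, D) is the frequency of q in D, n(q) is the number of documents that contain q, |D| is the length of D, avgdl is the average document length, N is the total number of documents, and k and b are parameters.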
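Equations (2.1) to (2.3) can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: the function name `bm25_scores`, the representation of documents as token lists, and measuring |D| in tokens are assumptions. The defaults k = 2.0 and b = 0.75 are the values given in Section 4.2.1.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k=2.0, b=0.75):
    """Return one BM25 score per document (Eqs. 2.1-2.3).

    docs is a non-empty list of token lists (an assumed representation).
    """
    n_docs = len(docs)                             # N
    avgdl = sum(len(d) for d in docs) / n_docs     # average document length

    # n(q): number of documents that contain each term
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(d))

    scores = []
    for d in docs:
        term_freq = Counter(d)                     # f(q, D) for every q in D
        total = 0.0
        for q in query_terms:
            f = term_freq[q]
            if f == 0:
                continue                           # the term contributes 0
            idf = math.log((n_docs - doc_freq[q] + 0.5) / (doc_freq[q] + 0.5))
            denom = f + k * (1 - b + b * len(d) / avgdl)
            total += idf * f * (k + 1) / denom     # Eq. (2.2)
        scores.append(total)                       # Eq. (2.1)
    return scores
```

Because Eq. (2.3) is negative whenever n(q) exceeds N/2, a term that occurs in more than half of the documents contributes a negative score.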
3 1 1 3 3.1 Yahoo! > > > > 2010 7 16 3.2 [4],[5],[6] [6] 9 9 5 5 P N 5
A S P 4 5 3.3 3.1 ID E00126 3 E00125 E00126 E00127 3.1: 3.4 3.1 3.1 12,047 6,348 6
Table 3.1:

  [ ]      [ ]      [ ]      [ ]
   21      662      199      357
  180    5,028    1,199    2,660
  243    6,357    1,496    3,331
  444   12,047    2,894    6,348
4 5 4.1 4.1.1 [7] 36 4.1 1 NY:0 NY:36 NY:18 NY:1 36 NY:0 8
Table 4.1:

  NY:1    NY:13   NY:25
  NY:2    NY:14   NY:26
  NY:3    NY:15   NY:27
  NY:4    NY:16   NY:28
  NY:5    NY:17   NY:29
  NY:6    NY:18   NY:30
  NY:7    NY:19   NY:31
  NY:8    NY:20   NY:32
  NY:9    NY:21   NY:33
  NY:10   NY:22   NY:34
  NY:11   NY:23   NY:35
  NY:12   NY:24   NY:36

4.1.2
4.2 4.2 0 4.2
Table 4.2:

  (a)              (b)              (c)
  NY               NY               NY
  NY:5     50      NY:5    463      NY:5    548
  NY:23    34      NY:18   305      NY:18   414
  NY:18    32      NY:23   252      NY:23   312
  NY:29    29      NY:29   205      NY:32   259
  NY:0     23      NY:32   190      NY:20   255
  NY:31    21      NY:20   185      NY:29   254
  NY:20    18      NY:31   125      NY:31   142
  NY:32    17      NY:19    87      NY:0     91
  NY:19    12      NY:0     77      NY:19    83
  NY:7      4      NY:30    65      NY:30    80
  NY:2      4      NY:26    37      NY:22    48
  NY:16     4      NY:2     35      NY:7     47
  NY:22     3      NY:11    33      NY:25    45
  NY:11     3      NY:7     32      NY:16    36
  NY:35     2      NY:25    32      NY:26    35
  NY:15     2      NY:22    29      NY:2     30
  NY:36     1      NY:35    23      NY:11    24
  NY:30     1      NY:16    20      NY:21    19
  NY:26     1      NY:21    11      NY:35    15
  NY:25     1      NY:6      9      NY:6     12
  NY:21     1      NY:15     6      NY:15     6
  NY:12     1      NY:36     5      NY:8      5
  NY:10     1      NY:8      1      NY:36     4
                   NY:27     1      NY:10     2
                   NY:12     1      NY:27     1
                   NY:10     1
3 NY:5 4.1 4.2 E00480 E00481 E00482 4.1: NY:5 1 W00410 W00411 W00412 4.2: NY:5 2 NY:5 4.1 4.2 NY:5 NY:18 4.3 4.4 E00182 E00183 E00184 4.3: NY:18 1 11
S05211 S05212 S05213 4.4: NY:18 2 NY:18 4.3 4.4 NY:18 NY:23 4.5 4.6 E00127 E00128 E00129 4.5: NY:23 1 S00282 S00283 S00284 4.6: NY:23 2 NY:23 4.5 4.6 NY:23 12
4.2 Okapi BM25
4.2.1
2.2.2 (2.2) (2.2) q 1 (2.2) r 1 $D_r$ q $s(d_r, q)$ $k = 2.0$, $b = 0.75$
4.2.2
4.3
Table 4.3: [ ] 447 2374 2601 5422
4.4
Table 4.4: 30

  (a)      (b)      (c)
  1.469    1.415    1.416
  1.296    1.411    1.340
  1.233    1.387    1.321
  1.233    1.276    1.311
  1.233    1.276    1.286
  1.233    1.257    1.271
  1.233    1.257    1.255
  1.233    1.209    1.236
  1.233    1.178    1.236
  1.123    1.140    1.215
  1.123    1.140    1.190
  1.123    1.094    1.190
  1.123    1.094    1.162
  1.123    1.094    1.128
  1.123    1.094    1.128
  1.123    1.094    1.128
  1.123    1.094    1.128
  1.123    1.094    1.128
  1.123    1.094    1.128
  1.123    1.094    1.128
  1.123    1.094    1.087
  0.887    1.094    1.087
  0.887    1.034    1.087
  0.887    1.034    1.087
  0.887    1.034    1.087
  0.887    1.034    1.087
  0.887    1.034    1.087
  0.887    1.034    1.036
  0.887    1.034    1.036
  0.887    1.034    1.036
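Table 4.4 lists 30 scores per column in descending order. A minimal sketch of selecting the 30 highest-scoring items is given below, assuming the scores are held in a mapping from an ID to its score. The function name `top_n` and the sample IDs in the usage line are hypothetical.

```python
import heapq


def top_n(scores, n=30):
    """Return the n highest (item_id, score) pairs, highest score first.

    scores is assumed to be a dict mapping an item ID to its score.
    """
    return heapq.nlargest(n, scores.items(), key=lambda pair: pair[1])


# Usage with made-up IDs and scores:
ranked = top_n({"x1": 1.0, "x2": 3.0, "x3": 2.0}, n=2)
```

`heapq.nlargest` avoids sorting the full list when n is small relative to the number of scored items; ties keep their original order.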
4.3 4.7 4.8 4.9 1 NY:5 ; [1.469] 1 E00193 E00194 E00195 2 E00223 E00224 E00225 2 NY:23 ; [1.233] 3 E00127 E00128 E00129 3 NY:18 ; [1.233] 4 E00563 E00564 E00565 4.7: 4.7 1,2 NY:5 1.469 1 3 NY:23 1.233 2 4 NY:18 1.233 3 1 2 3 15
1 NY:5 ; [1.094] 1 W01394 W01395 W01396 2 W04717 W04718 W04719 2 NY:18 ; [1.415] 3 W00400 W00401 W00402 3 NY:23 ; [1.415] 4 W00470 W00471 W00472 4.8: 4.8 1,2 NY:5 1.094 1 3 NY:18 1.415 2 4 NY:23 1.415 3 1 5 2 3 16
1 NY:5 ; [1.286] 1 S02204 S02205 S02206 2 S02364 S02365 S02366 2 NY:18 ; [1.416] 3 S00337 S00338 S00339 3 NY:23 ; [1.311] 4 S05106 S05107 S05108 4.9: 4.9 1,2 NY:5 1.286 1 3 NY:18 1.416 2 4 NY:23 1.311 3 1 2 3 17
5 5.1 4 1. 2. 1 NY:18 [ 1.469] 2 5.2 1 NY:18 E00013 E00564 19
2009 5 2 NY:23 4.8 (W00472) 4.9 (S05107) S05108 Web HP(http://www.torican.jp/) 20
6 6.1 12,047 3.1 4.2 ( A) ( B) 6.2 6.3 6.2 12,047 3.1 ID 6.1 21
ID E00062 E00063 E00069 E00128 E00129 E00130
Figure 6.1:
6.3
6.1
Table 6.1:

     [ ]      [ ]      [ ]
     662      417      245
   5,028    4,203      825
   6,357    5,900      457
  12,047   10,520    1,527

6.4
2 A B A 1.
2. 3. B 1. 3 D 1, D 2, D 3 4.2.1 2. 3 3. 3 1. 4. 5. 3 6.2 1.233 E00127 E00129 B 3. E00127 E00128 E00129 6.2: A B 6.5 6.3 6.4 3 1 23
6.3: A 6.4: B 6.3,6.4 1.645-6.055 0.35 21 6.3,6.4 24
6.4 6.2
Table 6.2:

                        (a) A               (b) B
                     [ ](%)    [ ]       [ ](%)    [ ]
   1.645 to  1.295   23(49%)    24     182(54%)    156
   1.295 to  0.945   25(50%)    25      48(35%)     88
   0.945 to  0.595   20(47%)    23      40(29%)     97
   0.595 to  0.245    0( 0%)     0       0( 0%)      0
  -0.455 to -0.805    0( 0%)     0       0( 0%)      0
  -0.805 to -1.155    1( 7%)    14       8(21%)     30
  -1.155 to -1.505    0( 0%)     0       0( 0%)      0
  -2.905 to -3.255    0( 0%)     0       0( 0%)      0
  -3.255 to -3.605    0( 0%)     0       2(18%)      9
  -3.605 to -3.955    0( 0%)     0       0( 0%)      0
  -3.955 to -4.305    0( 0%)     0       0( 0%)      1
  -4.305 to -4.655    0( 0%)     0       0( 0%)      0
  -5.705 to -6.055    0( 0%)     0       0( 0%)      0

1.645 0.595 A -0.805 -1.155 1 B -0.805 -1.155 8 -3.255 -3.605 2 A 6.5 6.6
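The rows of Table 6.2 step down from 1.645 in intervals of 0.35, and each percentage equals the first count of a pair divided by the sum of both counts (for example, 23 / (23 + 24) ≈ 49%). A minimal sketch of that binning is given below. The names `bin_counts` and `rate`, the boolean `flag` that decides which of the two counts an item falls into, and the choice of which bin edge is inclusive are all assumptions made for illustration.

```python
from collections import defaultdict


def bin_counts(scored, top, width=0.35):
    """Count items per score bin, counting bins downward from `top`.

    scored: iterable of (score, flag) pairs; flag is a hypothetical boolean
    selecting the first (True) or second (False) count of the bin.
    Bin 0 covers scores just below `top`, bin 1 the next interval, and so on.
    Scores exactly on a bin edge are subject to floating-point rounding.
    """
    counts = defaultdict(lambda: [0, 0])
    for score, flag in scored:
        index = int((top - score) // width)
        counts[index][0 if flag else 1] += 1
    return dict(counts)


def rate(first, second):
    """Percentage of `first` within first + second, rounded to an integer."""
    total = first + second
    return round(100 * first / total) if total else 0


# Usage with made-up scores:
example = bin_counts(
    [(1.5, True), (1.4, False), (1.0, True), (-1.0, False)], top=1.645
)
```

With top = 1.645, a score of -1.0 falls in bin 7, which corresponds to the row -0.805 to -1.155.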
6.5: A 6.6: B 6.5,6.6 1.590-6.110 0.35 21 6.5,6.6 6.3 26
Table 6.3:

                        (a) A               (b) B
                     [ ](%)    [ ]       [ ](%)    [ ]
   1.590 to  1.240   72(26%)   209     293(19%)   1233
   1.240 to  0.890  166(20%)   683     183(16%)    967
   0.890 to  0.540    9(21%)    33     323(19%)   1409
   0.540 to  0.190    2( 8%)    22      67(12%)    474
   0.190 to -0.160    0( 0%)     0       0( 0%)      0
  -0.160 to -0.510    0( 0%)     3       4( 6%)     60
  -0.510 to -0.860    0( 0%)     0       1(20%)      4
  -0.860 to -1.210    0( 0%)     0       1(17%)      5
  -1.210 to -1.560    0( 0%)     0       0( 0%)      0
  -5.760 to -6.110    0( 0%)     0       0( 0%)      0

6.3 A 1.240 0.890 1.590 1.240 7 1.240 0.890 54 A B A 6.7 6.8
6.7: A 6.8: B 6.7,6.8 1.591-6.109 0.35 21 6.7,6.8 6.4 28
Table 6.4:

                        (a) A               (b) B
                     [ ](%)    [ ]       [ ](%)    [ ]
   1.591 to  1.241   36( 9%)   360     195( 8%)   2379
   1.241 to  0.891   69(12%)   518     178(10%)   1673
   0.891 to  0.541   31( 8%)   373     112( 7%)   1579
   0.541 to  0.191    3( 3%)   101       9( 4%)    217
   0.191 to -0.159    0( 0%)     0       0( 0%)      0
  -0.159 to -0.509    0( 0%)     4       2(25%)      6
  -0.509 to -0.859    0( 0%)     0       0( 0%)      2
  -0.859 to -1.209    0( 0%)     1       0( 0%)      1
  -1.209 to -1.559    0( 0%)     0       0( 0%)      0
  -5.759 to -6.109    0( 0%)     0       0( 0%)      0

A B
6.6
6.2 6.3 6.4 6.2 1 6.3 6.4 1.295 0.595 1.590 0.540 1.591 0.541
7 7.1 N E00212 E00212 E00213 7.1: 7.1 E00212 E00213 30
8 12,047 (444 ) 2,894 (6,348 ) P 31
4 1 3 C 32
[1],, :,., pp.79-86, 2007.
[2],,, :, 16, pp.246-249, 2010.
[3] :,,, 2007.
[4],,, :, 10, pp.345-348, 2004.
[5],,, :,,, AS-5-2, pp.s-51-52, 2007.
[6] :,,, 2008.
[7],,,,,,, :,, 1997.
[8] Robertson, S. E., Walker, S., Jones, S., Hancock-Beaulieu, M. M., and Gatford, M.: Okapi at TREC-3, Proc. of the 3rd Text REtrieval Conference, 1994.