ÄØŽŠ

Stats

This page provides information about the letters and phonemes in the 49,070 words in this database. The first table shows statistics about letters, the second about phonemes. Both tables list the total number of occurrences in this database (frequency, also expressed as a percentage of all occurrences) and the number of occurrences at the beginning of a word. The final column is the percentage of a letter's or phoneme's occurrences that fall at the beginning of a word (column 4 divided by column 2).
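The arithmetic behind the letter table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used to build this page: the function name `letter_stats` and the two sample spellings are hypothetical, and the real input would be the full word list.

```python
from collections import Counter


def letter_stats(words):
    """Return {letter: (frequency, frequency %, freq as 1st, 1st/total %)}."""
    total = Counter()   # every occurrence of each letter
    first = Counter()   # occurrences at the beginning of a word
    for word in words:
        if not word:
            continue    # skip empty entries
        total.update(word)
        first[word[0]] += 1

    grand_total = sum(total.values())
    stats = {}
    for letter, freq in total.items():
        stats[letter] = (
            freq,
            round(100 * freq / grand_total, 1),    # share of all letters
            first[letter],                         # 0 if never word-initial
            round(100 * first[letter] / freq, 1),  # column 4 / column 2
        )
    return stats


# Two illustrative spellings (assumed here, not taken from the database).
sample = ["džudž", "tšørtš"]
for letter, row in sorted(letter_stats(sample).items()):
    print(letter, *row)
```

The phoneme table would follow the same pattern once each word is split into phonemes instead of single letters.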

This information was useful to me when I was working out the new orthography for English. For instance, you can see that the phonemes 'dž' and 'tš' ('j/g' and 'ch' in traditional spellings) are relatively infrequent. 'ch' was already a digraph, so it made sense to make 'j' one as well. In some cases a less frequent phoneme gets its own character while a more frequent one is written as a digraph (compare 'ž' and 'ei'). This was done to enhance legibility (to avoid spellings like "dzhudzh" for "judge" and "tshørtsh" for "church") and to be consistent ('š' is more common than 'ai').

Letters

| Letter | Frequency | Frequency % | Freq as 1st | 1st/Total |
|---|---|---|---|---|
| a | 14433 | 3.9% | 1150 | 8.0% |
| ä | 8173 | 2.2% | 1191 | 14.6% |
| e | 16135 | 4.4% | 1374 | 8.5% |
| i | 46356 | 12.5% | 2663 | 5.7% |
| y | 13360 | 3.6% | 434 | 3.2% |
| w | 7650 | 2.1% | 1179 | 15.4% |
| u | 19682 | 5.3% | 2264 | 11.5% |
| ø | 11985 | 3.2% | 129 | 1.1% |
| o | 7515 | 2.0% | 637 | 8.5% |
| r | 28272 | 7.6% | 3072 | 10.9% |
| l | 17954 | 4.9% | 1608 | 9.0% |
| n | 27594 | 7.5% | 1049 | 3.8% |
| m | 10236 | 2.8% | 2599 | 25.4% |
| b | 6978 | 1.9% | 2903 | 41.6% |
| p | 10027 | 2.7% | 3629 | 36.2% |
| v | 4280 | 1.2% | 723 | 16.9% |
| f | 5549 | 1.5% | 2297 | 41.4% |
| g | 10042 | 2.7% | 1372 | 13.7% |
| k | 15327 | 4.1% | 4440 | 29.0% |
| d | 17125 | 4.6% | 3817 | 22.3% |
| t | 25570 | 6.9% | 2747 | 10.7% |
| z | 11518 | 3.1% | 123 | 1.1% |
| s | 21215 | 5.7% | 5239 | 24.7% |
| ž | 3004 | 0.8% | 5 | 0.2% |
| š | 5876 | 1.6% | 593 | 10.1% |
| h | 3818 | 1.0% | 1833 | 48.0% |

Phonemes

'w' and 'y' are labeled as either consonant (K) or vowel (V). I represented the phonemes in the Phonetic English orthography so they would be easier to read in the context of the other data. To see which IPA symbols these phonemes correspond to, visit the home page.

| Phoneme | Frequency | Frequency % | Freq as 1st | 1st/Total |
|---|---|---|---|---|
| a | 8848 | 2.5% | 806 | 9.1% |
| ai | 4382 | 1.3% | 154 | 3.5% |
| aw | 1203 | 0.3% | 190 | 15.8% |
| ä | 8173 | 2.3% | 1191 | 14.6% |
| e | 10063 | 2.9% | 1233 | 12.3% |
| ei | 6072 | 1.7% | 141 | 2.3% |
| i | 35432 | 10.1% | 2663 | 7.5% |
| y (V) | 12677 | 3.6% | 300 | 2.4% |
| y (K) | 683 | 0.2% | 134 | 19.6% |
| w (V) | 3570 | 1.0% | 146 | 4.1% |
| w (K) | 2877 | 0.8% | 1033 | 35.9% |
| u | 19682 | 5.6% | 2264 | 11.5% |
| ø | 11985 | 3.4% | 129 | 1.1% |
| o | 7045 | 2.0% | 621 | 8.8% |
| oi | 470 | 0.1% | 16 | 3.4% |
| r | 28272 | 8.1% | 3072 | 10.9% |
| l | 17954 | 5.1% | 1608 | 9.0% |
| n | 24021 | 6.9% | 1049 | 4.4% |
| ng | 3573 | 1.0% | 0 | 0.0% |
| m | 10236 | 2.9% | 2599 | 25.4% |
| b | 6978 | 2.0% | 2903 | 41.6% |
| p | 10027 | 2.9% | 3629 | 36.2% |
| v | 4280 | 1.2% | 723 | 16.9% |
| f | 5549 | 1.6% | 2297 | 41.4% |
| g | 6469 | 1.8% | 1372 | 21.2% |
| k | 15327 | 4.4% | 4440 | 29.0% |
| d | 15041 | 4.3% | 3355 | 22.3% |
| dž | 1874 | 0.5% | 427 | 22.8% |
| dh | 210 | 0.1% | 35 | 16.7% |
| t | 23521 | 6.7% | 2180 | 9.3% |
| tš | 1274 | 0.4% | 337 | 26.5% |
| th | 775 | 0.2% | 230 | 29.7% |
| z | 11518 | 3.3% | 123 | 1.1% |
| s | 21215 | 6.1% | 5239 | 24.7% |
| ž | 1130 | 0.3% | 5 | 0.4% |
| š | 4602 | 1.3% | 593 | 12.9% |
| h | 2833 | 0.8% | 1833 | 64.7% |
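
The two tables can be checked against each other, because every letter's total should equal the summed frequencies of the phonemes whose spelling contains that letter. The sketch below does this for a handful of letters, using figures copied from the tables and taking 1,874 as the count for 'dž' and 1,274 as the count for 'tš'. The helper name `letter_total_from_phonemes` is hypothetical, and the dictionaries hold only the rows needed for these five letters.

```python
# Letter totals copied from the letter table (subset only).
letter_freq = {"d": 17125, "t": 25570, "ž": 3004, "š": 5876, "h": 3818}

# Phoneme totals copied from the phoneme table (only rows containing
# one of the letters above).
phoneme_freq = {
    "d": 15041, "dž": 1874, "dh": 210,
    "t": 23521, "tš": 1274, "th": 775,
    "ž": 1130, "š": 4602, "h": 2833,
}


def letter_total_from_phonemes(letter, phonemes):
    """Sum phoneme counts, weighted by how often the letter appears in each spelling."""
    return sum(count * spelling.count(letter)
               for spelling, count in phonemes.items())


for letter, expected in letter_freq.items():
    derived = letter_total_from_phonemes(letter, phoneme_freq)
    status = "ok" if derived == expected else "mismatch"
    print(letter, expected, derived, status)
```

For example, the 'd' letter total of 17,125 is the sum of the 'd', 'dž' and 'dh' phoneme counts (15,041 + 1,874 + 210).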