UTF-8 Sampler

UTF-8 is an ASCII-preserving encoding method for Unicode (ISO 10646), the Universal Character Set (UCS). The UCS encodes most of the world's writing systems in a single character set, allowing you to mix languages and scripts within a document without needing any tricks for switching character sets. This web page is encoded directly in UTF-8.
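
As a rough illustration of what "ASCII-preserving" means in practice, the following Python sketch compares the bytes of a plain ASCII string under both encodings and then encodes the Euro symbol. The sample strings and variable names are arbitrary choices for this example, not anything taken from Kermit itself.

```python
# Illustrative check of the "ASCII-preserving" property.
# The sample strings and variable names are assumptions made for this sketch.
ascii_text = "Kermit 95"
euro = "\u20ac"  # the Euro symbol, U+20AC

# Pure ASCII text yields exactly the same bytes under ASCII and UTF-8.
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# A non-ASCII character becomes a multi-byte sequence in which every
# byte is 0x80 or above, so none of its bytes collides with ASCII.
euro_bytes = euro.encode("utf-8")
all_high = all(b >= 0x80 for b in euro_bytes)

# Several scripts can share one string and one encoding, with no
# character-set switching, and survive a round trip unchanged.
mixed = "glass / γυαλιά / стекло / 玻璃"
round_trip = mixed.encode("utf-8").decode("utf-8") == mixed

print(same_bytes, euro_bytes.hex(), all_high, round_trip)
```

The all-high-bytes property is what lets software that only understands ASCII pass UTF-8 text through without misreading any part of a multi-byte character as an ASCII character.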

Kermit 95 can display UTF-8 plain text in Windows NT, XP, or 2000 when using a monospace Unicode font like Lucida Console or Courier New. The forthcoming GUI version of Kermit 95 will be able to display it too, even in Windows 95, 98, and ME. C-Kermit 7.0 and later can handle it too, if you have a Unicode display. As many languages as are representable in your font can be seen on the screen at the same time.

This, however, is a Web page. Some Web browsers can handle UTF-8 and some can't, and those that can might not have a sufficiently populated font to work with. CLICK HERE for a survey of Unicode fonts.

First, the Euro symbol: €.

From the Anglo-Saxon Rune Poem (Rune version):

ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ
ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ
ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬

From Laȝamon's Brut (The Chronicles of England, Middle English, West Midlands):

An preost wes on leoden, Laȝamon was ihoten
He wes Leovenaðes sone -- liðe him be Drihten.
He wonede at Ernleȝe at æðelen are chirechen,
Uppen Sevarne staþe, sel þar him þuhte,
Onfest Radestone, þer he bock radde.

From the Tagelied of Wolfram von Eschenbach (Middle High German):

Sîne klâwen durh die wolken sint geslagen,
er stîget ûf mit grôzer kraft,
ich sih in grâwen tägelîch als er wil tagen,
den tac, der im geselleschaft
erwenden wil, dem werden man,
den ich mit sorgen în verliez.
ich bringe in hinnen, ob ich kan.
sîn vil manegiu tugent michz leisten hiez.

Some lines of Odysseus Elytis (Greek):

Τη γλώσσα μου έδωσαν ελληνική
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.

από το Άξιον Εστί
του Οδυσσέα Ελύτη

A stanza of Pushkin's Bronze Horseman (Russian):

На берегу пустынных волн
Стоял он, дум великих полн,
И вдаль глядел. Пред ним широко
Река неслася; бедный чёлн
По ней стремился одиноко.
По мшистым, топким берегам
Чернели избы здесь и там,
Приют убогого чухонца;
И лес, неведомый лучам
В тумане спрятанного солнца,
Кругом шумел.

Šota Rustaveli's Veṗxis Ṭq̇aosani, The Knight in the Tiger's Skin (Georgian):

ვეპხის ტყაოსანი შოთა რუსთაველი

ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა, ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა; მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა, დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა.

And from the sublime to the ridiculous, here is a certain phrase in an assortment of languages:

Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
Sanskrit: (NEEDED)
Etruscan: (NEEDED)
Latin: Vitrum edere possum; mihi non nocet.
Esperanto: Mi povas manĝi vitron, ĝi ne damaĝas min.
French: Je peux manger du verre, cela ne me fait pas mal.
Provençal: Pòdi manjar de veire, me nafrariá pas.
Québécois: J'peux bouffer d'la vitre, ça m'fa pas mal.
Walloon: Dji pou magnî do vêre, çoula m' freut nén må.
Champenois: (NEEDED)
Lorrain: (NEEDED)
Picard: (NEEDED)
Corsican: (NEEDED)
Occitan: (NEEDED)
Catalan: Puc menjar vidre que no em fa mal.
Spanish: Puedo comer vidrio, no me hace daño.
Basque: Kristala jan dezaket, ez dit minik ematen.
Aragones: Puedo minchar beire, no me'n fa mal.
Galician: Eu podo xantar cristais e non cortarme.
Portuguese: Posso comer vidro, não me faz mal.
Brazilian Portuguese: Consigo comer vidro. Não me machuca.
Cabo Verde Creole: M' podê cumê vidru, ca ta maguâ-m'.
Papiamentu: (NEEDED)
Italian: Posso mangiare il vetro e non mi fa male.
Roman: Me posso magna' er vetro, e nun me fa male.
Sicilian: Puotsu mangiari u vitru, nun mi fa mali.
Milanese: Sôn bôn de magnà el véder, el me fa minga mal.
Venetian: Mi posso magnare el vetro, no'l me fa mae.
Rheto-Romance: (NEEDED)
Romanian: Pot să mănânc sticlă și ea nu mă rănește.
Pictish: (NEEDED)
Breton: (NEEDED)
Cornish: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
Welsh: Dw i'n gallu bwyta gwydr, dwy e ddim yn gwneud dolur i mi.
Irish: Tá mé in ann gloine a ithe; Ní chuireann sé isteach nó amach orm.
Scottish Gaelic: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
Anglo-Saxon: Ic mæg glæs eotan ond hit hearmiað me ne.
Middle English: Ich canne glas eten and hit hirtiþ me nouȝt.
English: I can eat glass and it doesn't hurt me.
Norwegian (Nynorsk): Eg kan eta glas utan å skada meg.
Norwegian (Bokmål): Jeg kan spise glass uten å skade meg.
Icelandic: Ég get borðað gler, það meiðir mig ekki.
Danish: Jeg kan spise glas, det gør ikke ondt på mig.
Soenderjysk: Æ ka æe glass uhen at det go mæ naue.
Frisian: Ik kin glês ite, it docht me net sear.
Dutch: Ik kan glas eten. Het doet me geen pijn.
Afrikaans: Ek kan glas eet, maar dit maak my nie seer nie.
German: Ich kann Glas essen, ohne mir weh zu tun.
Lëtzebuergescht: Ech kan Glas iessen, daat deet mir nët wei.
Schwäbisch: I kå Glas frässa, ond des macht mr nix!
Bayrisch: I koh Glos esa, und es duard ma ned wei.
Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.
Swedish: Jag kan äta glas, det skadar mig inte.
Finnish: Pystyn syömään lasia. Se ei koske yhtään.
Hungarian: Meg tudom enni az üveget, nem lesz tőle bajom.
Estonian: Ma võin klaasi süüa, see ei tee mulle midagi.
Latvian: Es varu ēst stiklu, tas man nekaitē.
Lithuanian: Aš galiu valgyti stiklą ir jis manęs nežeidžia.
Croatian: Ja mogu jesti staklo i ne boli me.
Czech: Mohu jíst sklo, neublíží mi.
Slovak: Môžem jesť sklo. Nezraní ma.
Polish: Mogę jeść szkło i mi nie szkodzi.
Albanian: Unë mund të ha qelq dhe nuk më gjen gjë.
Slovenian: Lahko jem steklo, ne da bi mi škodovalo.
Serbian: Mogu jesti staklo bez da mi škodi.
Serbian: Могу јести стакло без да ми шкоди.
Macedonian: Можам да јадам стакло, а не ме штета.
Russian: Я могу есть стекло, это мне не вредит.
Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
Bulgarian: Аз могъ да ям стъкло, а не ме боли.
Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
Georgian: მინას ვჭამ და არა მტკივა.
Turkish: Cam yiyebilirim, bana zararı dokunmaz.
Marathi: मी काच खाऊ शकतो, मला ते दुखत नाही.
Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
Farsi: .من می توانم بدونِ احساس درد شيشه بخورم
Pashto(1): زه شيشه خوړلې شم، هغه ما نه خوږوي
Arabic(1): أنا قادر على أكل الزجاج و هذا لا يؤلمني.
Hebrew(1): אני יכול לאכול זכוכית וזה לא מזיק לי.
Yiddish(1): איך קען עסן גלאָז און עס טוט מיר נישט װײ.
Ladino: (NEEDED)
Twi: Metumi awe tumpan, ɜnyɜ me hwee.
Yoruba(2): Mo lè je̩ dígí, kò ní pa mí lára.
Malay: Saya boleh makan kaca dan ia tidak mencederakan saya.
Tagalog: Kaya kong kumain nang bubog at hindi ako masaktan.
Chamorro: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
Javanese: Aku isa mangan beling tanpa lara.
Vietnamese: Tôi có thể ăn thủy tinh mà không hại gì.
Chinese: 我能吞下玻璃而不伤身体。
Japanese: 私はガラスを食べられます。それは私を傷つけません。
Korean: 나는 유리를 먹을 수 있어요. 그래도 아프지 않아요
Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
Lojban: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi

For testing purposes, some of these are repeated in a monospace font . . .

Euro Symbol: €.
Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
Icelandic: Ég get borðað gler, það meiðir mig ekki.
Polish: Mogę jeść szkło i mi nie szkodzi.
Romanian: Pot să mănânc sticlă și ea nu mă rănește.
Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.
Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
Georgian: მინას ვჭამ და არა მტკივა.
Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
Hebrew(1): אני יכול לאכול זכוכית וזה לא מזיק לי.
Arabic(1): أنا قادر على أكل الزجاج و هذا لا يؤلمني.
Japanese: 私はガラスを食べられます。それは私を傷つけません。
Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ

Notes:
  1. Correct right-to-left display of these languages depends on the capabilities of your browser. The period should appear on the left.
  2. The third word is Latin letter small 'j' followed by small 'e' with U+0329, Combining Vertical Line Below. This displays correctly only if your Unicode font includes the U+0329 glyph and your browser supports combining diacritical marks.
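
The two notes above can be checked against the Unicode character database. The sketch below uses Python's standard unicodedata module; the variable names are made up for this example, and the Yoruba word is written with an explicit escape so that the combining mark is visible in the source.

```python
import unicodedata

# Note 1: right-to-left scripts. The bidirectional class of a letter
# tells a renderer which way it runs: "R" and "AL" are right-to-left.
hebrew_alef = "\u05d0"        # first letter of the Hebrew sample
arabic_alef_hamza = "\u0623"  # first letter of the Arabic sample
bidi_classes = (
    unicodedata.bidirectional(hebrew_alef),
    unicodedata.bidirectional(arabic_alef_hamza),
)

# Note 2: the third word of the Yoruba sample is three code points,
# 'j', 'e', and U+0329, even though it displays as two letters.
word = "je\u0329"
names = [unicodedata.name(ch) for ch in word]

# A nonzero canonical combining class marks U+0329 as a combining mark
# that attaches to the preceding base letter.
is_combining = unicodedata.combining(word[2]) != 0

# In UTF-8 the word takes four bytes: one each for 'j' and 'e',
# and two for U+0329.
utf8_length = len(word.encode("utf-8"))

print(bidi_classes, names, is_combining, utf8_length)
```

Whether the mark is actually drawn beneath the 'e' still depends on the font and the browser, as the note says; the code only confirms how the text is encoded.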

(Additions, corrections, completions, gratefully accepted.)

Other Unicode samplers:


UTF-8 Sampler / The Kermit Project / Columbia University / kermit@columbia.edu / 7 Dec 2001