Xiaoma Cidian is a Chinese/English dictionary (simplified characters only) based on CC-CEDICT,
with many additions from various sources.
Your browser must be able to display Unicode fonts in order to view the Chinese characters and the pinyin tones.
If you cannot see the characters, or if you wish to change the Chinese character font, consult the Font Preferences page.
- Check the Discussion Forum for the list of recent updates.
- Nov.12.09 - V. 2.3 beta3. You can now use accented characters when searching by pinyin.
- Nov.01.09 - V. 2.3 beta1. Moved to www.xiaoma.info and major rewrite. Changed the engine to use UTF-8 everywhere, removed frames, improved reliability.
- August.09 - V. 2.2.15. Improved news annotation reliability (again!). Added RFI as a news source.
- August.09 - V. 2.2.14. Improved news annotation reliability. Improved compound results.
- Nov.20.08 - V. 2.2.13. Fixed news annotation once again. Added articles from the People’s Daily.
- Apr.15.08 - V. 2.2.12. Improved news annotation; should work better now. When searching for a word, but finding no result, it will try again without the specified tones.
- Dec.04.07 - V. 2.2.11. Link to most frequent characters from 3001 to 4000 added. News annotation and discussion forum are still broken.
- Jan.08.07 - V. 2.2.10. Added the display of character variants and a link to Unicode.org on the character page.
- Nov.10.06 - V. 2.2.9. 2532 new Chinese compounds, from latest CEDICT. Link to most frequent characters from 2501 to 3000 added. Added 10 news sources for news annotation.
- Jan.03.06 - V. 2.2.8. Happy New Year 2006! 3458 new Chinese compounds, from latest CEDICT distribution added to the dictionary.
- Oct.24.05 - V. 2.2.6. 3700 new and 510 modified Chinese compounds, from latest CEDICT distribution. Now more than 29000 Chinese entries!
- Oct.24.05 - V. 2.2.5. Improved segmentation tool.
- Oct.14.05 - V. 2.2.4. Added more news sources to the News Annotation tool.
- Oct.10.05 - V. 2.2.3. Added a News Annotation page (in test).
- Oct.08.05 - V. 2.2.2. Improvements to the Text Annotation tool.
- Sep.18.05 - V. 2.2.1. Added a link to the list of recent additions to the dictionary. Hints for pinyin and translations are now proposed on the 'enter a definition' page.
- Aug.28.05 - V. 2.2.0. Added text annotation (still in test). Added the front 'enter a definition' message. Added links to random HSK characters.
Xiaoma Cidian is mainly based on the CEDICT project (now continued as CC-CEDICT), on Unihan, and on various other files found on the Internet.
Below is a list of links to the related files and information. Many thanks to Adrian Robert for his list of files.
Other links of interest about Chinese
- Search by Chinese character:
Enter one or more Chinese characters. Type a single character to see the information for that character. Type 2 or more characters to search for all compounds containing that sequence of characters. Examples: "马", "词典".
Note: wildcard characters (* or ?) are not yet available for this kind of search.
You can also type a sequence of 4 or 5 hexadecimal digits ('0' to '9' and 'A' to 'F') to search for a character from its Unicode code point: the code point will be replaced by the character.
For instance, if you type "5C0F" or "9a6c" in the input field, it will be replaced by "小" or "马" respectively
(the Unicode code point for the character 小 is U+5C0F).
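The code point replacement described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by Xiaoma Cidian; the function name `replace_code_point` is hypothetical.

```python
import re

# Assumption: a query counts as a code point only if it consists
# entirely of 4 or 5 hexadecimal digits, as the help text describes.
HEX_CODE_POINT = re.compile(r"[0-9A-Fa-f]{4,5}")


def replace_code_point(query: str) -> str:
    """Return the character for a 4-5 digit hex code point, else the query unchanged."""
    text = query.strip()
    if not HEX_CODE_POINT.fullmatch(text):
        return query
    code_point = int(text, 16)
    # Surrogate code points do not stand for characters on their own.
    if 0xD800 <= code_point <= 0xDFFF:
        return query
    return chr(code_point)


print(replace_code_point("5C0F"))  # 小
print(replace_code_point("9a6c"))  # 马
```

Note that in this sketch an English word made only of the letters A to F, such as "face", would also be read as a code point, so the replacement only makes sense in the Chinese character search field.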
- Search by pinyin
Enter a valid pinyin expression. It is better to separate the syllables with a space to avoid ambiguity.
Optionally, use a digit from 1 to 5 as a suffix, or accented characters, to indicate the tone (5 means the neutral tone).
To enter "ü", you can type "u:", "v" or "ü" directly.
Result: exact matches are displayed first, followed by all words that match the search.
Examples: "bei3 jing1", "xiǎo mǎ", "nu: er", "beijing", "bei3jing1", "xiǎomǎ".
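As an illustration of how the different input styles above could be reduced to one form before a lookup, here is a minimal Python sketch. It assumes the syllables are separated by spaces (unspaced input such as "beijing" would need a syllable splitter, which is not shown), and the function names are hypothetical rather than taken from the site.

```python
import unicodedata

# Combining marks used for the four tones, mapped to tone digits.
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}


def normalize_syllable(syllable: str) -> str:
    """Reduce one syllable to plain letters (keeping ü) plus an optional tone digit."""
    s = syllable.lower().replace("u:", "ü").replace("v", "ü")
    tone = ""
    if s and s[-1] in "12345":
        s, tone = s[:-1], s[-1]
    kept = []
    # NFD splits "ǎ" into "a" plus a combining caron, so the tone mark
    # can be removed while the diaeresis of "ü" is left in place.
    for ch in unicodedata.normalize("NFD", s):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    return unicodedata.normalize("NFC", "".join(kept)) + tone


def normalize_query(query: str) -> list[str]:
    """Normalize a space-separated pinyin query, syllable by syllable."""
    return [normalize_syllable(s) for s in query.split()]


print(normalize_query("bei3 jing1"))  # ['bei3', 'jing1']
print(normalize_query("xiǎo mǎ"))     # ['xiao3', 'ma3']
print(normalize_query("nu: er"))      # ['nü', 'er']
```

A syllable without a digit or accent keeps no tone suffix, which matches the idea that the tone is optional in a search.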
- Search by English
Enter one or more English words separated by a space. Use the wildcard character "*" to replace any sequence of characters.
Result: exact matches are displayed first, followed by all words that match the search. For instance, "table" will look
for the word "table", while "*table*" will search for all words containing the expression "table", such as "table" itself, but also "stable", "mutable", "portable", "tables", "tablet", etc.
Examples: "horse", "long march", "*success*".
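The effect of the "*" wildcard can be illustrated with a short Python sketch that turns a search pattern into a regular expression and tests it against whole words. How the site actually implements the search is not documented here, so the function `matching_words` and the sample word list are assumptions made for the example.

```python
import re


def matching_words(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match a pattern in which "*" stands for any sequence of characters."""
    # Escape the literal parts and join them with ".*" where a "*" appeared.
    regex = re.compile(".*".join(re.escape(p) for p in pattern.split("*")), re.IGNORECASE)
    return [w for w in words if regex.fullmatch(w)]


words = ["table", "stable", "mutable", "portable", "tables", "tablet", "chair"]

print(matching_words("table", words))
# ['table']
print(matching_words("*table*", words))
# ['table', 'stable', 'mutable', 'portable', 'tables', 'tablet']
```

Without a wildcard the pattern must match the whole word, which is why "table" alone does not return "stable" or "tables".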
- English definitions
CL: In each definition, the abbreviation CL stands for classifier (量词), also known as a measure word. For instance, "桌子 table; desk; CL:张,套" means that the two classifiers for 桌子 are 张 and 套.
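For readers who process definitions programmatically, the sketch below separates the English glosses from the classifiers. It assumes only the display form shown in the example above (parts separated by ";", classifiers listed after "CL:" and separated by ","); raw dictionary files may be laid out differently, and the function name is hypothetical.

```python
def split_definition(definition: str) -> tuple[list[str], list[str]]:
    """Split a displayed definition into (glosses, classifiers)."""
    glosses: list[str] = []
    classifiers: list[str] = []
    for part in definition.split(";"):
        part = part.strip()
        if part.startswith("CL:"):
            # Everything after "CL:" is a comma-separated list of classifiers.
            classifiers.extend(c.strip() for c in part[3:].split(",") if c.strip())
        elif part:
            glosses.append(part)
    return glosses, classifiers


print(split_definition("table; desk; CL:张,套"))
# (['table', 'desk'], ['张', '套'])
```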
- Single character
If the character is known to Xiaoma Cidian, the pinyin pronunciation and the English definitions will be displayed along with other information:
- The Unicode code point (in the form U+XXXX); you can click on the code point to open the Unihan page at Unicode.org.
- The number of strokes of the character
- The HSK level (when available), from 1 to 4.
- The position of the character in the list of most frequent characters sorted from most to least frequent. For instance, "Pos: 3" means that the character is the third most frequent.
- The frequency "score": a figure within a scale from 0 to 1000, where 0 means a very rare character and 1000 the most frequently used character.
- A link to the page for this character in Wiktionary.
- The radical (and possibly the radical variants).
- The different variants of the character, like the traditional form(s).
- A list of frequent compounds containing this character.
- Links to the list of all compounds starting with and containing the character.
- Compound words
Displays all compounds containing the sequence of characters, sorted from the most to the least frequently used.
When searching from English words, the closest matches are presented first, in order to avoid displaying too many results.
If no match can be found, the individual characters and compounds contained in the entered expression are displayed.
Displays either the list of all characters matching the search (when searching for a single-character pinyin expression),
or all compound words containing the given compound. In this case, the exact matches are displayed first.
Enter a Chinese text here (simplified characters) to annotate it with the corresponding English definitions. The annotation engine
tries to recognize compounds. Note that it can fail in some cases to segment the text correctly, and can only recognize the words contained
in the dictionary.
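To give an idea of why segmentation can fail, here is a minimal Python sketch of greedy longest-match segmentation, a common simple technique for splitting Chinese text with a word list. It is not necessarily the method used by the annotation engine; the function `segment`, the `max_len` limit and the tiny word list are assumptions made for the example.

```python
def segment(text: str, dictionary: set[str], max_len: int = 4) -> list[str]:
    """Split text by always taking the longest dictionary word at the current position."""
    tokens: list[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted, even if unknown.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


known_words = {"小马", "词典", "小", "马"}
print(segment("小马是词典", known_words))
# ['小马', '是', '词典']
```

Because the choice is made greedily from left to right, a long word found early can hide the correct split later in the sentence, and a word missing from the word list falls back to single characters.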
Chinese daily news annotation
Select a news article from the list and have it annotated.
The articles are fetched from the news RSS feeds of various News providers.
If you would like a source to be added to the list, send an email or post to the Discussion Forum!
In some cases, Xiaoma Cidian will not be able to load a given news article. Most of the time, this is due to a change in the provider's
web site. If this happens, do not hesitate to report the problem so that the fetching engine can be upgraded.
Characters and words tables
- List characters by stroke count.
- List characters that are difficult to find when searching by radical. This is helpful when the radical of a character is not obvious, as it avoids traversing the whole list of characters.
List the 214 Kangxi radicals (康熙部首), sorted by stroke count.
Click on one radical to list all characters classified under that radical.
List the characters and compounds you may encounter in level 1 (甲 or A), 2 (乙 or B), 3 (丙 or C) and 4 (丁 or D) of the HSK.
Note that these lists are informational only. Xiaoma Cidian is not affiliated in any way with the HSK (Hànyǔ Shuǐpíng Kǎoshì 汉语水平考试).
- Characters: all the characters encountered in a level (as single words or as part of compounds).
This list does not take into account the different possible meanings or pronunciations.
- Compounds: all compound words for that level. The list of words has an option to display all words, single-char words only or compounds only.
The lists of the most frequent individual characters you may encounter in simplified Mandarin Chinese.
The list is split into chunks of 500 characters, which can be sorted by stroke count, frequency, radical, or pinyin.
The characters can also be displayed in random order, which is helpful for learning purposes.
The Chinese character pages include the list of known variants. A variant can be the traditional form,
a variant for a given definition only (usable only in some circumstances), a full variant, etc.