.. _data:

How the (real) Duolingo API works: The Data
===========================================

There's a lot of data returned by the :ref:`endpoints <endpoints>`, and many of the endpoints duplicate data to better suit how the apps consume it.

TTS server information
----------------------

The Text-to-Speech engine information is returned in ``/version_info``. The top-level keys ``tts_base_url`` and ``tts_voice_configuration`` can be combined to get the audio URL for any individual word or phrase (by phrase id).

For the following JSON returned...

.. code-block:: javascript

    "tts_base_url": "https://d7mj4aqfscim2.cloudfront.net/",
    "tts_cdn_url": "http://static.duolingo.com/",
    "tts_voice_configuration": {
        "multi_voices": "{\"dn\": [\"dn\"], \"fr\": [\"fr\", \"fr/mathieu\"], \"en\": [\"en/salli\"], \"pt\": [\"pt\"], \"nb\": [\"nb/liv\"], \"de\": [\"de\"], \"tr\": [\"tr/filiz\"], \"it\": [\"it/carla\"], \"da\": [\"da\"], \"sv\": [\"sv/astrid\"], \"es\": [\"es\"]}",
        "path": "tts/{voice}/{type}/{id}",
        "voices": "{\"nb\": \"nb/liv\", \"en\": \"en/salli\", \"tr\": \"tr/filiz\", \"it\": \"it/carla\", \"sv\": \"sv/astrid\"}"
    }

... we get URLs like:

* ``https://d7mj4aqfscim2.cloudfront.net/tts/sv/astrid/token/hej``
* ``https://d7mj4aqfscim2.cloudfront.net/tts/fr/mathieu/token/bon``

Note that ``multi_voices`` and ``voices`` are JSON-encoded strings rather than nested objects, so they have to be parsed a second time. So far I don't know of any voice listed in ``voices`` that is not also listed in ``multi_voices``, so it may be safe to ignore the ``voices`` key.

Dictionary
----------

As described in :ref:`endpoints <endpoints>`, the dictionary API lives on a different server than the other endpoints. Its URI can also be found in ``/version_info``, as ``dict_base_url``.

Calendar
--------

A user's per-language practice calendar and their overall calendar are both accessible through the ``users/show?id=`` endpoint. No login is required to get a user's calendars.

The user's overall calendar is the top-level key ``calendar``, while their language-specific calendar is stored as ``calendar`` under the ``language_data[language]`` key.
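
As a minimal sketch of the layout just described, the function below pulls both calendars out of an already-parsed ``users/show`` response. The function name ``getCalendars`` and the sample object are hypothetical; only the key layout (``calendar`` and ``language_data[language].calendar``) is taken from this page.

```javascript
// Hypothetical helper: `user` is assumed to be the parsed JSON body of users/show.
function getCalendars(user, language) {
  const languageData = (user.language_data || {})[language] || {};
  return {
    overall: user.calendar || [],
    perLanguage: languageData.calendar || [],
  };
}

// Made-up response fragment, trimmed to the keys used here.
const calendarSample = {
  calendar: [{ improvement: 10, datetime: 1435978875000 }],
  language_data: {
    fr: { calendar: [{ improvement: 10, datetime: 1435978875000 }] },
  },
};

const calendars = getCalendars(calendarSample, "fr");
console.log(calendars.overall.length, calendars.perLanguage.length);
```

Falling back to an empty list keeps the caller simple when a language is not present in ``language_data``.
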
More information on the ``language_data`` block is below.

Here's an example user calendar. ``improvement`` is the number of points earned for the practice, and ``datetime`` is a timestamp in milliseconds. I find it worth noting that Duolingo returns milliseconds, but only with accuracy to the second: the last three digits are zero.

.. code-block:: javascript

    calendar: [
        { improvement: 10, datetime: 1435978875000 },
        { improvement: 10, datetime: 1435979114000 },
        { improvement: 10, datetime: 1435979472000 },
        { improvement: 10, datetime: 1435979728000 },
        { improvement: 10, datetime: 1435980180000 },
        ...
    ]

Certificates
------------

A user's certificates -- the extended tests that cost 25 lingots -- are available (without authentication) through the ``/users/show?id=`` endpoint.

.. code-block:: javascript

    "certificates": [
        {
            "datetime": "\n\n\n\n\n\n\n\n1 month ago\n\n",
            "id": "abcde",
            "language": "de",
            "language_string": "German",
            "score": 2.09
        },
        ...
    ]

The ``datetime`` is a newline-padded, human-readable string (here "1 month ago") rather than a timestamp.

``language_data``
-----------------

Language data is a behemoth. It's gigantic, it's horrifying. I love it.

``language_data`` is a field in ``users/show`` that stores essentially everything the app might want to know about the user's progress in a language. Only one key is present in ``language_data`` at a time -- the user's current learning language. Another language must be set as the current one using the ``/me/current_language`` endpoint before its ``language_data`` can be retrieved.

The following is some of, but not all of, what is stored inside a ``language_data`` block:

* ``streak``: the user's current streak in days for that language.
* ``language_string``: the name of the language being learned, e.g. French.
* ``level_progress``: the XP earned so far in the current level.
* ``fluency_score``: a float containing the user's fluency score.
* ``level`` and ``next_level``: integers with the user's current and next levels.
* ``notify_time``: the time, in minutes, at which the user should be notified to practice. This is stored relative to the user's current timezone as returned by the ``users/show`` endpoint.
* ``points_ranking_data``: a list containing user objects of the user's friends, ranked by points. More on user objects below.
* ``num_skills_learned``: the number of skills learned in the current language.
* ``level_left``: XP until the user levels up.
* ``tracking_properties``: data about the user's tree.
* ``next_lesson``: the next lesson the user has to learn. Contains the URL, which can be used with the skills API.
* ``skills``: a list of skill objects that the user has learned.
* ``bonus_skills``: a list of bonus skill objects that the user has learned.
* ``points``: an integer with the user's current XP.

Users
-----

Duolingo uses the same user object wherever users appear in its API -- namely, in the ``points_ranking_data`` list and in the ``points_ranking_data_dict`` dict, which holds the data on each user with their ID as the key.

.. code-block:: javascript

    {
        username: "me",
        language_string: "French",
        points_data: {
            languages: [ ],
            total: 10000
        },
        avatar: "https://s3.amazonaws.com/duolingo-images/avatar/default_2",
        language: "fr",
        fullname: null,
        id: 1234,
        rank: 1,
        self: true
    }

Skills
------

A skill object can be obtained either from the ``skills`` list inside ``language_data`` or from the ``skills`` endpoint.
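
For the first of those two routes, here is a rough sketch that reads the skill lists out of an already-parsed ``users/show`` response. It relies on the statement above that ``language_data`` holds only the current learning language. The name ``getCurrentSkills`` and the sample object are hypothetical, and the skill objects are left empty because their fields are not assumed here.

```javascript
// Hypothetical helper: `user` is assumed to be the parsed JSON body of users/show.
function getCurrentSkills(user) {
  const languageData = user.language_data || {};
  // Only the current learning language is expected to be present,
  // so the first key is taken as the current language.
  const [language] = Object.keys(languageData);
  if (language === undefined) {
    return { language: null, skills: [], bonusSkills: [] };
  }
  const block = languageData[language];
  return {
    language,
    skills: block.skills || [],
    bonusSkills: block.bonus_skills || [],
  };
}

// Made-up response fragment; real skill objects carry more data than this.
const skillsSample = {
  language_data: { fr: { skills: [{}, {}], bonus_skills: [{}] } },
};

const current = getCurrentSkills(skillsSample);
console.log(current.language, current.skills.length, current.bonusSkills.length);
```

If more than one language ever appeared in ``language_data``, this sketch would silently pick the first key, so a caller that cares may want to check the key count first.
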