AP NEWS TAXONOMY
The AP News Taxonomy is a comprehensive set of standardized vocabularies for describing English-language news content. Terms in the vocabularies cover all aspects of news: subjects, people, places, organizations, and more. When you submit content to the automated AP Tagging Service, the data that comes back is drawn from these vocabularies. Publishers may also choose to integrate the AP News Taxonomy into their own publishing systems to support manual tagging.
Features of the taxonomy include:
- Hierarchical structure for subjects and geographic locations, to enable both broad and narrow searches.
- Relationships between concepts, such as between two people (Parent-Child), or between a person and a group or organization (Athlete-Team).
- Properties of people, places and things—such as an athlete’s uniform number, the latitude and longitude of a geographic location, or the stock ticker symbol for a company.
- Synonyms, acronyms, and spelling variants.
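As a rough illustration of how these features might fit together in a publisher's own system, the sketch below models a term with a broader (parent) link, synonyms, and free-form properties. The `Term` class, its field names, and the helper functions are hypothetical and do not reflect the actual AP schema. The sample data borrows the Crime and Illegal firearms examples from this page, but placing one directly under the other, and the synonym shown, are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One taxonomy term (hypothetical shape, not the AP schema)."""
    term_id: str
    label: str
    broader: Optional[str] = None  # id of the parent term, if any
    synonyms: list[str] = field(default_factory=list)
    properties: dict[str, str] = field(default_factory=dict)


def narrower_closure(terms: dict[str, Term], root_id: str) -> set[str]:
    """Return root_id plus the ids of every term below it in the hierarchy."""
    # Build a parent -> children index from the broader links.
    children: dict[str, list[str]] = {}
    for term in terms.values():
        if term.broader is not None:
            children.setdefault(term.broader, []).append(term.term_id)
    # Walk downward from the root, collecting every descendant once.
    found: set[str] = set()
    stack = [root_id]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))
    return found


def find_by_name(terms: dict[str, Term], name: str) -> Optional[Term]:
    """Match a label or synonym, ignoring case."""
    wanted = name.casefold()
    for term in terms.values():
        if any(n.casefold() == wanted for n in [term.label, *term.synonyms]):
            return term
    return None


# Illustrative data only; ids, hierarchy, and synonym are made up.
TERMS = {
    "t1": Term("t1", "Crime"),
    "t2": Term("t2", "Illegal firearms", broader="t1", synonyms=["Illegal guns"]),
    "t3": Term("t3", "Tour de France"),
}
```

A broad search for Crime would then match content tagged with any id in `narrower_closure(TERMS, "t1")`, while a narrow search for Illegal firearms would match only that term.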
What is included in the taxonomy?
There are five main areas of coverage:
- AP Subject
A wide variety of hierarchically structured topics ranging from broad categories (Crime) to specific concepts (Illegal firearms). Also includes many named events such as Academy Awards and Tour de France. More than 4,000 terms in all.
- AP Geography
Over 2,000 geographic place names arranged hierarchically — continents, world regions, countries, territories, national capitals, major world cities, US states, Canadian provinces, and a large number of US cities and towns.
- AP Organization
Organizations and institutions from a wide variety of sectors: government organizations, non-profits, sports teams, colleges and universities, political and ideological groups, cultural institutions, and more. Over 1,000 different terms.
- AP Person
Celebrities, artists, designers, authors, business leaders, political figures, sports figures, royalty, and other newsmakers known at the global or US national level. Coverage is especially broad for US newsmakers in politics, entertainment and sports, including complete rosters for major professional sports teams, men’s NCAA Division I basketball and football players, all US officeholders at the federal and gubernatorial levels, and all candidates for those offices. More than 90,000 individuals covered.
- AP Company
Over 35,000 publicly traded companies, including all companies with primary shares trading on any of 70 major global stock exchanges or trading as American depositary receipts (ADRs) on an American exchange.
What about updates?
The AP News Taxonomy is constantly being updated to capture the latest news and the biggest newsmakers. Whether it’s this week’s IPOs, or the new crop of contestants on American Idol, AP’s taxonomy developers are always working to keep the vocabularies current and relevant.
Subscribers are kept up to date in real time: as soon as a change is published, the new version becomes available in the AP News Taxonomy. Numerical versioning keeps the changes organized and in sync with the data provided by the AP Tagging Service. A detailed log of all changes is accessible through a separate API. You can keep track of all changes, or just the ones you care most about.
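One way a subscriber might track only the changes they care about is to filter the change log by version number and vocabulary. The sketch below assumes each log entry carries an integer version, a vocabulary name, a term id, and an action. The `Change` record, its fields, and the `pending_changes` function are hypothetical; the real change log format is defined by AP's API documentation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Change:
    """One change-log entry (assumed shape, not the AP format)."""
    version: int      # assumed to be an increasing integer
    vocabulary: str   # e.g. "subject", "person" (names are illustrative)
    term_id: str
    action: str       # e.g. "added", "updated", "deprecated"


def pending_changes(
    log: Iterable[Change],
    last_seen_version: int,
    vocabularies: Optional[Iterable[str]] = None,
) -> list[Change]:
    """Return changes newer than last_seen_version, oldest first.

    If vocabularies is given, keep only changes in those vocabularies.
    """
    wanted = None if vocabularies is None else set(vocabularies)
    selected = [
        change
        for change in log
        if change.version > last_seen_version
        and (wanted is None or change.vocabulary in wanted)
    ]
    return sorted(selected, key=lambda change: change.version)


# Illustrative log entries; all values are made up.
SAMPLE_LOG = [
    Change(12, "company", "c-1", "added"),
    Change(10, "person", "p-7", "updated"),
    Change(11, "subject", "s-3", "deprecated"),
]
```

Calling `pending_changes(SAMPLE_LOG, 10)` keeps every entry after version 10, while passing `vocabularies=["company"]` narrows the result to company changes only.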
How does the service work?
The taxonomy is accessed by making calls to an API (Application Programming Interface). The subscriber can request the full set of terms in a given vocabulary, a subset of terms, or information about a particular term. Calls are also available for retrieving deprecated terms, term change logs, and additional information about the structure of the taxonomy. Taxonomy data can be returned in a variety of Semantic Web-compatible formats, including RDF (serialized as XML, JSON, or TTL) and NewsML-G2. It can also be returned in HTML format. A comprehensive Developer’s Guide provides all the necessary details.
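To make the request options above concrete, here is a minimal sketch of how a client might build request URLs for a whole vocabulary or a single term in a chosen format. Everything specific in it is an assumption: the host is a placeholder, and the path layout, the `format` query parameter, and the format tokens are invented for illustration. The actual endpoints and parameters are described in the Developer’s Guide.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Placeholder host; this is NOT the real AP endpoint.
BASE_URL = "https://taxonomy.example.invalid"

# Hypothetical tokens for the formats this page lists:
# RDF as XML, JSON, or TTL, plus NewsML-G2 and HTML.
FORMATS = {"rdf-xml", "rdf-json", "rdf-ttl", "newsml-g2", "html"}


def build_url(
    vocabulary: str,
    term_id: Optional[str] = None,
    fmt: str = "rdf-json",
) -> str:
    """Build a URL for a full vocabulary, or for one term if term_id is set.

    The path scheme and the 'format' parameter are assumptions.
    """
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    # Percent-encode each path segment so ids with spaces or slashes are safe.
    path = "/" + quote(vocabulary, safe="")
    if term_id is not None:
        path += "/" + quote(term_id, safe="")
    return f"{BASE_URL}{path}?{urlencode({'format': fmt})}"
```

For example, `build_url("subject")` would address the whole subject vocabulary in this invented scheme, and `build_url("person", "abc 123", "html")` a single term rendered as HTML.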
© 2012 The Associated Press. All rights reserved. Terms and conditions apply. See AP.org for details.