The Linguistic Data Consortium (LDC) is an open consortium of universities, companies, and government research laboratories. It creates, collects, and distributes speech and text databases, lexicons, and other resources for linguistics research and development. The University of Pennsylvania is the LDC's host institution. The LDC was founded in 1992 with a grant from the US Defense Advanced Research Projects Agency (DARPA) and is partly supported by grant IRI-9528587 from the Information and Intelligent Systems division of the National Science Foundation. The director of the LDC is Mark Liberman, and the executive director is Christopher Cieri.
- Corpus linguistics
- Cross-Linguistic Linked Data (CLLD) – project coordinating over a dozen linguistics databases; hosted by the Max Planck Institute (Germany)
- European Language Resources Association (ELRA) – a Luxembourg- and France-based institute with a mission similar to the LDC's
- Language Grid – a platform for language resources, operated by NPO Language Grid Association, primarily active in Asia
- Machine translation
- Natural language processing
- Speech technology