Norwegian Language Resources Inventory (draft)

This preliminary overview of digital language resources and tools in Norway was collected by questionnaire, with the support of the RCN through the NO-CLARIN preparatory project.

1. The NHH Termbase (NHH-T)

Type Multilingual terminology database
Size 1600 termbase entries
Languages Norwegian, English
Rightholders NHH
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location NHH
Effort needed (a) technical (b) nontechnical a) 4 pw b) 4 pw
Rationale for selection Updated and covering central concepts in microeconomics and other economic-administrative domains
Present usage Regular use for educational purposes at NHH, formal public launch autumn 2010, large potential target group
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

2. KB-N (Kunnskapsbank for norsk økonomisk-administrativt domene)

Type Multilingual terminology database for the business and economics domains
Size 8467 termbase entries
Languages Norwegian, English
Rightholders NHH, Uni Digital
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location NHH
Effort needed (a) technical (b) nontechnical a) 8 pw b) 8 pw, need for updating/quality check
Rationale for selection Relevant for research and development due to wide scope of concepts in economic-administrative domains
Present usage Currently used by about ten researchers but has a much larger potential target group
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

3. The NOT database (NOT-basen, Norsk termbank)

Type Multilingual terminology database
Size 30,521 termbase entries
Languages Norwegian, English
Rightholders Uni Digital
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location NHH
Effort needed (a) technical (b) nontechnical a) 8 pw b) 16 pw
Rationale for selection Very wide coverage of domains and concepts including petroleum sector, not updated
Present usage Currently used by about ten researchers but has a much larger potential target group
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

4. The UHR database (UHR-basen, Universitets- og høyskolerådets termbase)

Type Multilingual terminology database for administration in higher education
Size 1000 termbase entries
Languages Norwegian, English
Rightholders UHR
Anticipated access policy Public and free
Anticipated reuse policy n/a
Anticipated location UHR
Effort needed (a) technical (b) nontechnical a) 4 pw b) 4 pw
Rationale for selection Fully updated termbase covering central administrative concepts related to higher education
Present usage Regular use for administrative purposes, some hundreds of users
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

5. The RTT-material (RTT-materialet, Rådet for teknisk terminologi)

Type Multilingual terminology database for technical domains
Size 48,314 termbase entries
Languages Mainly Norwegian, English, French
Rightholders Fagbokforlaget
Anticipated access policy n/a
Anticipated reuse policy n/a
Anticipated location n/a
Effort needed (a) technical (b) nontechnical a) 8 pw b) 16 pw
Rationale for selection Wide coverage of technical domains, not updated
Present usage Currently a small user group but a wider target group
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

6. The EEA-EU database (EØS-EU-basen, EØS-sekretariatets terminologidatabase og norske oversettelser av rettsakter innlemmet i EØS-avtalen)

Type Multilingual terminology and translation database related to EEA and EU
Size 36,776 termbase entries
Languages Norwegian, English, French
Rightholders EØS-sekretariatet, part of EØS/EFTA-seksjonen i Europaavdelingen, UD
Anticipated access policy n/a
Anticipated reuse policy n/a
Anticipated location EØS-sekretariatet, part of EØS/EFTA-seksjonen i Europaavdelingen, UD
Effort needed (a) technical (b) nontechnical a) 8 pw b) 8 pw
Rationale for selection Continuously updated, high-quality termbase, inclusion must be negotiated with EØS-sekretariatet
Present usage Regular use for translation in government and public administration, hundreds of external users, continuously updated
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

7. SNORRE (Standard Norges termbase)

Type Multilingual terminology database
Size n/a
Languages Norwegian, English
Rightholders Standard Norge
Anticipated access policy n/a
Anticipated reuse policy n/a
Anticipated location Standard Norge
Effort needed (a) technical (b) nontechnical a) 16 pw b) 8 pw
Rationale for selection Termbase being updated in ongoing project
Present usage New terminology resource, launch planned in autumn 2010
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

8. The KRIPOS database (KRIPOS-basen)

Type Multilingual terminology database for policing purposes
Size n/a
Languages Norwegian, English, French
Rightholders KRIPOS
Anticipated access policy n/a
Anticipated reuse policy n/a
Anticipated location KRIPOS, language section
Effort needed (a) technical (b) nontechnical a) 16 pw b) 8 pw
Rationale for selection Continuously updated, high-quality termbase, inclusion must be negotiated with KRIPOS
Present usage Regular internal use for policing purposes
Similar resources or cooperations EuroTermBank (LV), Rikstermbanken (SE), DanTerm (DA), IATE (EU)
Data or tool data

9. NRK corpus

Type Speech and text corpus, phonetic transcription, lexicon. ~23 hours read and spontaneous speech, 1.5 M words of news texts (subtitles)
Size Ca. 23 hours of speech, 1.5 M words
Languages Norwegian
Rightholders NRK, SINTEF
Anticipated access policy Free for research purpose
Anticipated reuse policy Restricted
Anticipated location SINTEF
Effort needed (a) technical (b) nontechnical a) 4 pw b) 6 pw
Rationale for selection Limited availability of particularly spontaneous speech in Norwegian, necessary for robust speech recognition development.
Present usage Development of automatic speech recognition systems
Similar resources or cooperations NST database (University of Bergen), EUROM.0 (Norwegian University of Science and Technology), EUROM.1 (Norwegian University of Science and Technology)
Data or tool data

10. NST database enhancements

Type Speech and text corpus
Size n/a
Languages Norwegian
Rightholders Consortium of Norsk Språkråd, IBM, Norwegian University of Science and Technology, University of Bergen, University of Oslo
Anticipated access policy See NST database
Anticipated reuse policy See NST database
Anticipated location Norwegian University of Science and Technology
Effort needed (a) technical (b) nontechnical a) 3 pw b) 5 pw
Rationale for selection The enhancements and corrections made to the NST database save work for potential users of the NST database
Present usage Development of speech recognition systems
Similar resources or cooperations NST database (University of Bergen)
Data or tool data

11. The Place Name Archive Hordaland (Stadnamn i Hordaland)

Type Place name database with home names from the county of Hordaland, spoken and written.
Size Ca. 250,000 names
Languages Norwegian
Rightholders Department of Linguistic, Literary and Aesthetic studies, University of Bergen
Anticipated access policy Public and free for both research and public purposes
Anticipated reuse policy Public and free
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Comprehensive database with local place names in Hordaland.
Present usage n/a
Similar resources or cooperations Place name archives in Oslo, Trøndelag and Sogn og Fjordane county (University of Oslo, Norwegian University of Science and Technology, S&Fj Fylkesarkiv).
Data or tool data

12. Typology of Norwegian Tonal Accents/Norwegian Tonal Accent Database (Norsk tonelagstypologi)

Type Speech database, recordings of scripted frame utterances with target words representing realizations of the Norwegian tonal accents in different dialects, systematized for accent, structure of stressed syllable, syllable count, structure of phonological word, position in phrase, etc. Originates in a project funded by the Norwegian Research Council 2000-2002: Norsk tonelagstypologi. 20,595 individual sound files from 116 recordings, each of one speaker. Each recording consists of the same set of 69 short utterances, read twice. Most of the recordings are split into separate files and organized in a Filemaker database.
Size 6.7 GB, 20,595 sound files
Languages 30 different Norwegian dialects
Rightholders Gjert Kristoffersen, University of Bergen
Anticipated access policy Free for research purposes
Anticipated reuse policy Free for research purposes
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical 8 pw. If opting for the Imdi metadata standard and the Elan annotation tool, the original 'unsplit' recordings should be stored as single Elan files. Annotations will have to be added to these files utterance by utterance. All recordings contain the same utterances (although not necessarily in the same order), possibly facilitating this task. An Imdi template based on one of the informants already exists. Since most of the metadata are already defined in the Filemaker database, this should help speed up the remaining work.
Rationale for selection Of interest for linguists working with linguistic tone and intonation. There is a small international community interested in Scandinavian tonal accents. Use of the resource will probably presuppose a certain knowledge of a Scandinavian language, but this will to a certain extent also depend on the level of detail of the linguistic metadata provided. These are not sufficient today in this respect.
Present usage Has mostly been used by the participants of the original project, but other colleagues have from time to time also used parts of the data
Similar resources or cooperations None, but the databases built by the Swedish project Swedia contain data that are to a certain extent comparable to our data.
Data or tool data

13. Light stressed syllables (Jamvektsbasen)

Type Speech database, recordings of scripted frame utterances with target words representing realizations of stressed syllables in different Norwegian and Swedish dialects, most of them dialects where light stressed syllables have been preserved from Old Norse, systematized for tonal accent, structure of stressed syllable, syllable count, structure of phonological word, etc. Most of the recordings are split into separate files and organized in a Filemaker database.
Size Ca. 2 GB
Languages Norwegian dialects from North Gudbrandsdal, Tinn, Oppdal. Dalarna Swedish: Älvdalen, Våmhus, Vinäs, Skattungbyn, Östre Mora, Sollerön
Rightholders Gjert Kristoffersen, University of Bergen
Anticipated access policy Free for research purposes
Anticipated reuse policy Free for research purposes
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical 8 pw. The three most recent recordings of three Swedish dialects are stored as 'unsplit', annotated Elan files. All other files are yet to be annotated.
Rationale for selection In addition to researchers working on stress realization in Germanic, the database is also of interest to researchers working on Germanic tonal accents, not least from a historical-comparative perspective.
Present usage The material has so far been used only by the owner
Similar resources or cooperations None. Concerning collaborations, see pt. 10
Data or tool data

14. The Dialect Collection at the University of Bergen

Type Audio recordings/samples of dialects (mainly filed in analogue media). Digitising in progress. Descriptions of dialects according to standard questionnaires.
Size 1600 sound recordings/samples on tape. Hours: unknown. Number of other documents: unknown
Languages Norwegian
Rightholders The dialect collection at the Department of Linguistic, Literary and Aesthetic Studies (LLE) at the University of Bergen.
Anticipated access policy Restricted, but free for research purposes. Most recordings are protected by law. The material is only to be used by researchers.
Anticipated reuse policy Restricted. Material protected by law only to be used for scientific purposes.
Anticipated location The dialect collections at the Department of Linguistic, Literary and Aesthetic Studies (LLE), at the University of Bergen.
Effort needed (a) technical (b) nontechnical The resource is currently being developed, and the project is fully financed. Digitising in progress, the deadline of which is unknown.
Rationale for selection n/a
Present usage The material is used in connection with a research project.
Similar resources or cooperations Dialect archives at other institutions of higher learning in Norway: the University of Oslo, the Norwegian University of Science and Technology, and the University of Tromsø
Data or tool data

15. The Industrial Area Project (Industristadprosjektet)

Type Speech corpus
Size 213 hours of sound recordings, 2100 pages of text. More material will be collected during the project.
Languages Norwegian
Rightholders Department of Linguistic, Literary and Aesthetic Studies (LLE) at the University of Bergen.
Anticipated access policy Restricted, but free for research purposes. Most recordings are protected by law. The material is only to be used by researchers.
Anticipated reuse policy Restricted. Material protected by law only to be used for scientific purposes.
Anticipated location The dialect collections at the Department of Linguistic, Literary and Aesthetic Studies (LLE), at the University of Bergen.
Effort needed (a) technical (b) nontechnical The resource is currently being developed, and the project is fully financed. Digitising in progress; deadline n/a.
Rationale for selection n/a
Present usage The material is used in connection with a research project.
Similar resources or cooperations Dialect archives at other institutions of higher learning in Norway: the University of Oslo, the Norwegian University of Science and Technology, and the University of Tromsø
Data or tool data

16. Processes of dialect change (Dialektendringsprosessar)

Type Corpus of transcribed interviews
Size 491,621 words (September 1, 2010). Expected size next year: 1.5 M words
Languages Norwegian
Rightholders Dialektendringsprosessar (Helge Sandøy, LLE, University of Bergen).
Anticipated access policy Free for research purposes
Anticipated reuse policy Restricted, but free for researchers
Anticipated location University of Bergen and Uni Digital
Effort needed (a) technical (b) nontechnical a) 1 person-year, b) 3 person-years. The corpus will hopefully be extended by data from new projects
Rationale for selection An efficient and important tool in research on changes in the Norwegian language
Present usage 10 researchers, currently
Similar resources or cooperations The Text Laboratory, University of Oslo.
Data or tool data

17. Modern import words in the Nordic languages (Moderne importord i språka i Norden, MIN)

Type Corpus of newspapers from certain days in 1975 and 2000 in all Nordic countries. Import words are annotated for etymological source, style, topic, etc.
Size 1.9 M words.
Languages Icelandic, Faroese, Norwegian, Danish, Sweden-Swedish, Finland-Swedish, Finnish
Rightholders Moderne importord i språka i Norden v/ Helge Sandøy
Anticipated access policy Free for public
Anticipated reuse policy Public and free
Anticipated location University of Bergen and Uni Digital
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection The data were collected in order to compare the rate and typology of the usage of import words in the Nordic languages. The corpus can be reused e.g. in order to illustrate language usage in related languages.
Present usage This was a subproject of MIN, and the corpus is used in reports presented in two volumes of the series Moderne importord i språka i Norden (Novus forlag, Oslo).
Similar resources or cooperations none
Data or tool data

18. KIAP Corpus (Cultural Identity in Academic Prose)

Type Corpus of published research articles in economics, linguistics, and medicine
Size 3,150,000 words
Languages English, French, Norwegian
Rightholders Kjersti Fløttum and Uni Digital
Anticipated access policy Free for research purposes
Anticipated reuse policy Research purposes
Anticipated location University of Bergen/Uni Digital
Effort needed (a) technical (b) nontechnical none
Rationale for selection Relevant for the study of differences in academic discourse
Present usage Researchers and PhDs (5-10). During main project period: 20-30
Similar resources or cooperations University of Grenoble
Data or tool data

19. The Norwegian Spanish Parallel Corpus, NSPC

Type Corpus (parallel translational corpus, unidirectional)
Size 1.5 M words in each language
Languages Norwegian - Spanish
Rightholders Lidun Hareide, University of Bergen
Anticipated access policy Free for research purposes
Anticipated reuse policy Research purposes
Anticipated location Uni Digital
Effort needed (a) technical (b) nontechnical a) 1 pm, b) 1 pm
Rationale for selection All texts in the Norbok database published in Norwegian and translated into Spanish between 2000 and 2008.
Present usage In process of completion
Similar resources or cooperations The NSPC is built to be roughly comparable to the P-ACTRES English - Spanish Parallel corpus built at the University of León, Spain
Data or tool data

20. Medieval Nordic Text Archive (Menota)

Type Corpus of Medieval Nordic texts, several of which are linguistically annotated
Size Presently 17 texts comprising 923,000 words, encoded according to a high philological standard. The archive is expected to grow considerably in the coming years, in part due to the Menotec project (2010-2012)
Languages Medieval Nordic (1100-1500), i.e. Old Icelandic, Old Norwegian, Old Swedish and Old Danish, and also Latin texts of Nordic provenance from the same period
Rightholders The editor(s) of each text, as specified in the header of each XML file, bibliographical usage discussed here: http://www.menota.org/help/bibliographics.page
Anticipated access policy Free
Anticipated reuse policy As specified in § 3 of the access agreement (deposit license), http://www.menota.org/avtaler/depo1-2.html
Anticipated location University of Oslo, Unit for Digital Documentation
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Menota is presently the major digital archive of freely available Medieval Nordic texts encoded according to a high academic standard
Present usage Based on feedback, users are typically academics (at all levels), exact data for usage not available
Similar resources or cooperations Handrit.is is not a text archive, but it is a catalogue which could be linked to Menota, and vice versa, however, it only covers Old Icelandic and Old Norwegian
Data or tool data

21. Infrastructure for the Exploration of Syntax and Semantics (INESS)

Type E-infrastructure for syntactically (and semantically) annotated corpora, including the first extensive treebank for Norwegian.
Size The treebank for Norwegian will be built in the period until October 2015. The projected size of the gold standard treebank is 500,000 words. The projected size of the automatically annotated treebank is 500 M words. Other languages will also be added.
Languages Norwegian, Sami, German, English and other languages.
Rightholders University of Bergen and Uni Digital.
Anticipated access policy Mostly public and free, but depending on conditions for source texts.
Anticipated reuse policy Mostly public and free, but depending on conditions for source texts.
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical The resource is currently being developed, and the project is fully financed for 180 pm.
Rationale for selection Treebanks can be used for developing analyzers for various applications. The infrastructure will provide treebanking support for others.
Present usage Not yet in use.
Similar resources or cooperations No similar resources for Norwegian.
Data or tool infrastructure

22. Corpus of Norwegian as a second language, ASK (Norsk andrespråkskorpus)

Type Text corpus of Norwegian as a second language, searchable by linguistic annotations and informant attributes. The data is collected from Norsk språktest's archives of examination results from foreigners learning Norwegian as a second language.
Size 2000 texts, ca. 600,000 words in total. A control corpus with 200 texts written by people with Norwegian as their mother tongue.
Languages Norwegian
Rightholders University of Bergen
Anticipated access policy Free for research purposes
Anticipated reuse policy Restricted
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Useful for L2 studies.
Present usage Intensively used by about 20 researchers at masters and PhD level.
Similar resources or cooperations First corpus in Norway providing this type of language data.
Data or tool data

23. COLA: Corpus oral de lenguaje adolescente

Type Speech corpus with linked transcriptions. Teenage talk from Spanish speaking cities.
Size 0.8 M words, 50 hours of audio files
Languages Spanish from Madrid, Buenos Aires and Santiago de Chile
Rightholders University of Bergen, Annette Myre Jørgensen
Anticipated access policy Non-commercial research (as agreed with informants)
Anticipated reuse policy Non-commercial research
Anticipated location Uni Digital
Effort needed (a) technical (b) nontechnical Conversion of texts to Corpuscle format: 1 month. Work on user interface: 0.5 month
Rationale for selection Important source for studying Spanish youth language.
Present usage 250 web users in 25 countries worldwide. Important source for studying Spanish youth language. Popular among researchers of oral language, teachers of Spanish, and students of Spanish. Has been used as a basis for 15 MA and 10 PhD theses.
Similar resources or cooperations Can be compared with COLT and UNO.
Data or tool data

24. COLT: Corpus of London teenage language

Type Corpus
Size 0.5 M words, 50 hours of audio files
Languages English
Rightholders University of Bergen.
Anticipated access policy Non-commercial research
Anticipated reuse policy Non-commercial research
Anticipated location Uni Digital or University of Bergen. COLT is presently distributed on the ICAME CD and as a set of 3 CDs with audio. Some of the texts are available as a part of BNC, but they have been further processed in Bergen. The audio files are only available from Bergen.
Effort needed (a) technical (b) nontechnical Conversion of texts to Corpuscle format: 1 month. Work on user interface: 0.5 month
Rationale for selection Important source for studying English youth language.
Present usage We have distributed 25 sets of CDs with transcripts/audio. A wider user group accesses the corpus through the web interface.
Similar resources or cooperations Can be compared with COLA and UNO.
Data or tool data

25. UNO

Type Corpus
Size 0.2 M words, 30 hours of audio files
Languages Norwegian
Rightholders Probably Kristine Hasund, HIA
Anticipated access policy Non-commercial research
Anticipated reuse policy Non-commercial research
Anticipated location Uni Digital
Effort needed (a) technical (b) nontechnical Conversion of texts to Corpuscle format: 1 month. Work on user interface: 0.5 month
Rationale for selection The first source of this kind for Norwegian (spontaneous dialogue)
Present usage n/a
Similar resources or cooperations Big Brother corpus, Oslo
Data or tool data

26. ICAME

Type Collection of corpora, written, spoken, historical
Size 18 corpora, 14 M words
Languages English
Rightholders The collectors of the various corpora
Anticipated access policy Non-commercial research. Today the material can be distributed on a CD if the user signs the conditions on the order form. We have to renegotiate the policy, this may be different for the different corpora.
Anticipated reuse policy Possibly id.
Anticipated location Uni Digital
Effort needed (a) technical (b) nontechnical Conversion of texts to XML: 9 months. Work on user interface: 1 month
Rationale for selection Standard resource for research on English. Many of the corpora are only available for use with legacy concordance programs like WordSmith. The corpora should be made searchable via a web interface in Corpuscle. The historical ones are particularly valuable since few texts exist as compared to modern texts.
Present usage We have distributed more than 1000 CDs with these corpora. The corpora are very popular among scholars of English.
Similar resources or cooperations Some of these corpora are available to registered users at the University of Lancaster (hence duplication should be avoided through cooperation).
Data or tool data

27. Newspaper corpus

Type Corpus
Size 800 M running words
Languages Norwegian
Rightholders The newspaper publishers. We are allowed to let users search the corpus and show the hits with limited context.
Anticipated access policy Free but conditions to be re-negotiated
Anticipated reuse policy Free but conditions to be re-negotiated
Anticipated location Uni Digital
Effort needed (a) technical (b) nontechnical Conversion and re-tagging of 23 newspapers: 11.5 months. Work on user interface: 1 month
Rationale for selection The largest collection of Norwegian texts available for language studies. Dynamic corpus, extraction of new word forms (unregistered earlier). Distribution of hits by newspaper and year.
Present usage 290 registered users.
Similar resources or cooperations NoWac, Text Laboratory Oslo
Data or tool data

28. Wittgenstein Archives Bergen 5000 pages (WAB 5000)

Type corpus in XML (TEI-P5) and HTML output (different versions) formats, XSLT stylesheets, Web interface
Size More than 2 M words
Languages German and English
Rightholders The Master and Fellows of Trinity College, Cambridge, Bertrand Russell Archives, Ontario (Ts-201a1, Ts-201a2), Oxford University Press, Oxford, University of Bergen, Bergen, Uni Research, Bergen
Anticipated access policy Creative Commons General Public License Attribution, Non-Commercial, Share-Alike version 3 (CCPL BY-NC-SA)
Anticipated reuse policy Creative Commons General Public License Attribution, Non-Commercial, Share-Alike version 3 (CCPL BY-NC-SA)
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical a) XSLT programming and Web services: 4 PM, b) guidance of programming, administration, dissemination and communication, incl. Web sites: 4 PM
Rationale for selection One of the most important resources for 20th century philosophy and thought, a study case and test-bed for philology, literary studies and XML/TEI research and applications, multilingual
Present usage n/a
Similar resources or cooperations http://wittgensteinsource.org/ , http://wab.aksis.uib.no/wab_hw.page/
Data or tool data

29. Corpuscle (Korpuskel)

Type Corpus tool
Size n/a
Languages unknown
Rightholders Uni Digital
Anticipated access policy LLGPL (Lisp Lesser General Public License, basically public domain)
Anticipated reuse policy LLGPL
Anticipated location Uni Digital, downloadable
Effort needed (a) technical (b) nontechnical 2 person months
Rationale for selection Usability: Handles any corpus that is annotated on a word and/or structural level. Unicode support. Suitable for large corpora (order of magnitude 1 billion tokens and more). Powerful search engine (functionality of Corpus Workbench's query language plus support for multi-valued and set-valued attributes, hierarchical structures with arbitrary nesting, and more), fast query processing using newly developed algorithms based on suffix arrays and finite state automata. Support for user annotations and editing using integrated relational database. Customizable Web interface.
Present usage At present the system is used by several projects/corpora at Uni Digital and University of Bergen: ASK, Dialektendringsprosessar, Norsk Aviskorpus
Similar resources or cooperations Similar tools: Corpus Workbench/CQPWeb. Potential collaborations: The Text Laboratory at the University of Oslo, Språkbanken (Sweden), many potential users internationally
Data or tool tool
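
Illustration (not part of the questionnaire data): the rationale above mentions query processing based on suffix arrays. The sketch below shows the general idea on a toy token list. It is a minimal example under stated assumptions, not the tool's own implementation: the function names are hypothetical, the suffix array is built naively by sorting, and attributes, structures and finite state automata are left out entirely.

```python
from typing import List


def build_suffix_array(tokens: List[str]) -> List[int]:
    """Return suffix start positions, sorted by the token sequence they begin.

    Naive construction (sorts full suffixes); fine for a toy example only.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def find_phrase(tokens: List[str], sa: List[int], phrase: List[str]) -> List[int]:
    """Return all corpus positions where `phrase` occurs, via binary search."""
    n = len(phrase)

    def prefix(k: int) -> List[str]:
        # First n tokens of the k-th suffix in sorted order.
        return tokens[sa[k]:sa[k] + n]

    # Lower bound: first suffix whose n-token prefix is >= phrase.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < phrase:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose n-token prefix is > phrase.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= phrase:
            lo = mid + 1
        else:
            hi = mid

    # All matches sit in one contiguous block of the suffix array.
    return sorted(sa[start:lo])
```

Once the array is built, each lookup needs only a logarithmic number of comparisons in the corpus size, which is why the technique suits large corpora.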

30. TCA2 (Translation Corpus Aligner 2)

Type Software to prepare texts for parallel corpora
Size n/a
Languages n/a
Rightholders Uni Digital
Anticipated access policy Free for research purposes
Anticipated reuse policy So far others have been allowed to modify the code for their own purposes.
Anticipated location Uni Digital, downloadable
Effort needed (a) technical (b) nontechnical Depends on which enhancements are desired. Text editing: 0.5 months? Word alignment/term extraction module: 3 months? A web version: 4 months? All the work is technical work.
Rationale for selection Handles pairs of texts that are translations of each other, where sentences have been XML tagged. Alignment is done partly automatically, partly by manual intervention. Automatic alignment assumes sentences are related 1-1, 1-2, 1-0, 2-1, or 0-1. The process is helped by "anchor files", which contain pairs of words/phrases that are more or less translations of each other. The program is based on an earlier program by Knut Hofland and a lot of (his) experience with sentence alignment. On the whole users are very satisfied.
Present usage Used by 10 users in 8 projects for 6 language pairs.
Similar resources or cooperations Several command driven aligners exist, but not so many with a GUI.
Data or tool tool
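
Illustration (not part of the questionnaire data): the rationale above says that automatic alignment relates sentences 1-1, 1-2, 1-0, 2-1, or 0-1. The sketch below shows how such a restricted set of alignment types can be searched with dynamic programming. It is a simplified, length-based example and not TCA2's actual method: the cost function, the two penalty values and all names are assumptions made for illustration, and anchor files are not used.

```python
from typing import List, Tuple

# Allowed alignment types: (source sentences used, target sentences used).
BEADS = [(1, 1), (1, 2), (2, 1), (1, 0), (0, 1)]
MERGE_PENALTY = 2.0   # assumed extra cost for 1-2 and 2-1
SKIP_PENALTY = 10.0   # assumed extra cost for 1-0 and 0-1


def bead_cost(src_len: int, tgt_len: int, bead: Tuple[int, int]) -> float:
    """Toy cost: character-length difference plus a penalty per bead type."""
    if 0 in bead:
        return src_len + tgt_len + SKIP_PENALTY
    penalty = 0.0 if bead == (1, 1) else MERGE_PENALTY
    return abs(src_len - tgt_len) + penalty


def align(src: List[str], tgt: List[str]) -> List[Tuple[List[int], List[int]]]:
    """Return the cheapest sequence of beads as (source indices, target indices)."""
    inf = float("inf")
    n, m = len(src), len(tgt)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + bead_cost(s_len, t_len, (di, dj))
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)
    # Walk back from the end of both texts to recover the chosen beads.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        di, dj = back[i][j]
        beads.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    beads.reverse()
    return beads
```

Called with two source sentences of 10 and 20 characters and three target sentences of 10, 9 and 11 characters, this sketch pairs the first sentences 1-1 and maps the second source sentence to the last two target sentences (1-2). A real aligner would add lexical evidence such as anchor word pairs and let the user correct the result.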

31. IDP (Interactive Dynamic Presentation)

Type XSLT-based software to filter and present XML-TEI-encoded texts in a web page, in user-defined ways
Size n/a
Languages n/a
Rightholders Uni Digital and University of Bergen
Anticipated access policy Online use is open.
Anticipated reuse policy So far others have been allowed to modify the code for their own purposes.
Anticipated location Uni Digital and University of Bergen
Effort needed (a) technical (b) nontechnical 4 person months programming
Rationale for selection not relevant
Present usage n/a
Similar resources or cooperations http://wittgensteinsource.org/ , http://wab.aksis.uib.no/wab_hw.page/
Data or tool tool

32. PROIEL corpus

Type corpus of classical Bible translations
Size 518,000 words
Languages Ancient Greek, Latin, Classical Armenian, Old Church Slavic, Gothic
Rightholders PROIEL project, IFIKK, Oslo
Anticipated access policy Creative Commons Attribution-Noncommercial-Share Alike 3.0
Anticipated reuse policy Public and free (noncommercial)
Anticipated location University of Oslo
Effort needed (a) technical (b) nontechnical Sources available, much work required to create a proper query interface
Rationale for selection Covers the New Testament and translations, as well as a number of original texts in Latin. Only large corpus covering the classical ancient languages, interest also for Bible scholars
Present usage 294 registered users
Similar resources or cooperations (Small) Latin treebanks at the Perseus project (Tufts, USA) and in Milano, otherwise there are very few resources for these languages
Data or tool data

33. English Resource Grammar (ERG)

Type computational grammar
Size 250,000 lines of code
Languages English
Rightholders DELPH-IN
Anticipated access policy open source (MIT)
Anticipated reuse policy Public and free
Anticipated location CLARINO LAP (on-line use) and repository (for download)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection The ERG is the largest freely available precision grammar for English, already a point of reference for many
Present usage about one dozen current users world-wide
Similar resources or cooperations ParGram English grammar
Data or tool tool

34. PET

Type parser
Size 60,000 lines of code
Languages language-independent
Rightholders DELPH-IN
Anticipated access policy open source (LGPL)
Anticipated reuse policy Public and free
Anticipated location CLARINO LAP (on-line use) and repository (for download)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection PET provides the run-time environment for the ERG (and other DELPH-IN grammars)
Present usage many dozens of current users world-wide
Similar resources or cooperations Xerox Linguistic Environment (XLE)
Data or tool tool

35. Linguistic Knowledge Builder (LKB)

Type grammar engineering toolkit
Size 160,000 lines of code
Languages language-independent
Rightholders DELPH-IN
Anticipated access policy open source (MIT)
Anticipated reuse policy Public and free
Anticipated location CLARINO repository (for download)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection the LKB is a very popular tool for unification-based grammar engineering
Present usage several hundred current users world-wide, both for teaching and R&D usage
Similar resources or cooperations Xerox Linguistic Environment (XLE)
Data or tool tool

36. Redwoods

Type manually annotated HPSG treebank
Size 250,000 words
Languages English
Rightholders DELPH-IN project
Anticipated access policy open source (MIT)
Anticipated reuse policy Public and free
Anticipated location CLARINO LAP (on-line use) and repository (for download)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection largest available HPSG treebank
Present usage about one dozen current users world-wide
Similar resources or cooperations n/a
Data or tool data

37. WikiWoods

Type automatically annotated HPSG treebank based on Wikipedia
Size 900 M words
Languages English
Rightholders DELPH-IN project
Anticipated access policy open source (MIT)
Anticipated reuse policy Public and free
Anticipated location CLARINO LAP (on-line use) and repository (for download)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection rich syntacto-semantic annotations for the complete English Wikipedia
Present usage a handful of (power) users world-wide
Similar resources or cooperations n/a
Data or tool data

38. MaltXLE

Type Architecture for 'stacked' dependency parsing with LFG features
Size n/a
Languages English and German
Rightholders Lilja Øvrelid
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location CLARINO LAP (on-line use)
Effort needed (a) technical (b) nontechnical a) 2 pw b) 2 pw
Rationale for selection stacked parsing increases domain robustness, deeper linguistic features useful for applications
Present usage Internal (in-house) use only
Similar resources or cooperations Presupposes MaltParser and XLE
Data or tool tool

39. Leksikografisk bokmålskorpus (LBK)

Type text corpus
Size 40 mill words
Languages Norwegian
Rightholders Dept of Linguistic and Nordic studies at University of Oslo
Anticipated access policy Licensed
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection Coverage of text types follows SSB (Statistics Norway) time-use figures for reading
Present usage n/a
Similar resources or cooperations None in Norway or for the Norwegian language
Data or tool data

40. The French Newspaper Corpus

Type Text corpus. Part-of-speech tagged newspaper texts in French. Available through web interface. Search by variables such as part of speech, suffix etc.
Size 115 M words
Languages French
Rightholders Developer: The Text Laboratory. Texts: LCD and ACL.
Anticipated access policy Accessible for users from University of Oslo
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations University of Grenoble
Data or tool data

41. The KAL Corpus

Type Text corpus. 3300 texts written by pupils. Marks and other background data are available. Search by a range of variables. Annotation: http://omilia.uio.no/kal/filer/tekn_info.html
Size 3300 texts
Languages Norwegian
Rightholders Annotated corpus: The Text Laboratory. Pupil texts: The project "Kvalitetssikring av læringsutbyttet i norsk skriftlig": http://prosjekt.hihm.no/r97-kal/
Anticipated access policy Free for research purposes
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

42. LOGON Tourist Corpus

Type Text corpus. Parallel aligned tourist information texts in Norwegian and English.
Size ca 175,000 words
Languages Norwegian and English
Rightholders Developer: The Text Laboratory in cooperation with the LOGON-project
Anticipated access policy Access only for research and development purposes. Association with Språkbanken needs to be examined/determined.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

43. NoWaC - Norwegian Web as Corpus

Type Text corpus. Web based corpus for Norwegian bokmål, 700 M words. Constructed through automatic retrieval of documents from the .no domain. The documents are downloaded from the Internet and then processed. POS-tagged.
Size 700 M words
Languages Norwegian bokmål
Rightholders The Text Laboratory/PhD student Emiliano Guevara.
Anticipated access policy Free for research purposes
Anticipated reuse policy Free for research purposes
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

44. NP-annotated Norwegian corpus

Type Text corpus. Norwegian texts in which all NPs are annotated with information about their form, meaning and discourse relations. Available through web interface. Search by all information marked on the NPs.
Size The resource is under construction and is fully financed.
Languages Norwegian
Rightholders Developer: The Text Laboratory. Text and annotation: Norwegian University of Science and Technology
Anticipated access policy Restricted access, regulated by the Norwegian University of Science and Technology
Anticipated reuse policy n/a
Anticipated location Norwegian University of Science and Technology
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

45. The OPUS Corpus

Type Text corpus. Written language from 60 languages. OPUS is a growing collection of translated texts from the web. The OPUS project converts and aligns free online data, adds linguistic annotation, and provides the community with a publicly available parallel corpus. OPUS is based on open source products and the corpus is also delivered as an open content package. Several tools are used to compile the current collection. All pre-processing is done automatically. No manual corrections have been carried out.
Size 30 M words
Languages 60 languages
Rightholders Developer: The Text Laboratory in cooperation with Jörg Tiedemann, University of Groningen
Anticipated access policy Public and free.
Anticipated reuse policy Restricted
Anticipated location University of Groningen
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection The main motivation for compiling OPUS is to provide an open source parallel corpus that uses standard encoding formats including linguistic annotation. A public collection of parallel corpora that can freely be used and distributed makes it possible for everyone to run experiments on bitexts and their results can easily be compared.
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

46. The Oslo Corpus of tagged Norwegian texts, Bokmål (Oslokorpuset av taggede, norske tekster, bokmål)

Type Text corpus. Texts from fiction, newspapers/magazines and factual prose. Available through web interface. Search by variables such as genre, part of speech, suffix etc. Tagged with the Oslo-Bergen-tagger.
Size 18.5 M words
Languages Norwegian bokmål
Rightholders Developer: The Text Laboratory.
Anticipated access policy Restricted: access only for research and development purposes. Association with Språkbanken needs to be examined/determined.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

47. The Oslo Corpus of tagged Norwegian texts, Nynorsk (Oslokorpuset av taggede, norske tekster, Nynorsk)

Type Text corpus. Texts from fiction, newspapers/magazines and factual prose. Available through web interface. Search by variables such as genre, part of speech, suffix etc. Tagged with the Oslo-Bergen-tagger.
Size 3.8 M words
Languages Norwegian nynorsk
Rightholders Developer: The Text Laboratory.
Anticipated access policy Restricted: access only for research and development purposes. Association with Språkbanken needs to be examined/determined.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

48. The Oslo Corpus of Bosnian Texts

Type Text corpus. Bosnian texts from various genres. Available through web interface. Search by variables such as genre, part of speech, suffix etc.
Size 1.5 M words
Languages Bosnian
Rightholders Developer: The Text Laboratory.
Anticipated access policy Restricted: access only for research and development purposes.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

49. Oslo Multilingual Corpus

Type Text corpus. Multilingual parallel text corpora (subcorpora) with original texts and translations. Available through web interface. Search by variables such as genre, part of speech, suffix etc. SGML-tagged, tagged grammatically with several different taggers.
Size 15.5 mill words
Languages Principally Norwegian, English, French and German, but also smaller corpora with Dutch and Portuguese texts.
Rightholders Developer: The Text Laboratory in cooperation with the SPRIK project, ILOS, University of Oslo
Anticipated access policy Restricted: access only for research and development purposes at University of Oslo and University of Bergen.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

50. Sami-Norwegian Corpus

Type Text corpus. Sami-Norwegian parallel aligned texts. Available through web interface. Search by variables such as genre, part of speech, suffix etc. Tagged grammatically with the Sami CG-tagger developed at the Center for Sami Language Technology (Senter for samisk språkteknologi), University of Tromsø.
Size Unknown
Languages Sami and Norwegian
Rightholders Developer: The Text Laboratory and Center for Sami Language Technology, University of Tromsø. Text and annotation: Center for Sami Language Technology, University of Tromsø
Anticipated access policy Access regulated by the Center for Sami Language Technology, University of Tromsø.
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

51. The Sidaama Corpus

Type Text corpus. Sidaama texts from the New Testament. Translated by Kjell Magne Yri.
Size 150,000 words
Languages Sidaama
Rightholders Developer: The Text Laboratory. Texts: Kjell Magne Yri
Anticipated access policy At Kjell Magne Yri's (ILN) disposal
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

52. The Usenet Corpus

Type Text corpus. Norwegian texts from the no*-hierarchy of Usenet (newslist web domain) from 1998 to 2002.
Size 140 M words
Languages Norwegian
Rightholders Developer: The Text Laboratory.
Anticipated access policy Public and free. Probably to be made accessible through Språkbanken.
Anticipated reuse policy Public and free
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

53. The Big Brother Corpus

Type Speech corpus. Norwegian speech corpus. Orthographically transcribed and linked to video files. Almost all the television broadcasts from the first season of Big Brother in 2001. Spontaneous speech, including laughter, crying, yelling, discussions etc. XML-tagged and grammatically tagged by the NoTa-tagger.
Size Ca. 550,000 words
Languages Norwegian
Rightholders Developer: The Text Laboratory.
Anticipated access policy Restricted: access only for research and development purposes. Association with Språkbanken needs to be examined/determined.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

54. The Ruija Corpus

Type Speech corpus. Speech corpus with spoken language from Kven-speaking areas (1962-2009). Both Kven and Norwegian speech. The corpus is modelled on NoTa-Oslo, among others.
Size Ca. 70 interviews
Languages Kven and Norwegian
Rightholders Developer: The Text Laboratory. Material from two projects with project manager Pia Lane, ILN
Anticipated access policy At the LICHEN project's (by researcher Pia Lane, ILN) disposal.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

55. Nordic Dialect Corpus (Nordisk dialektkorpus)

Type Speech corpus. Nordic Dialect Corpus is a corpus of Norwegian, Swedish, Danish, Faroese and Övdalian (and soon Icelandic and Finland Swedish) spoken language. It consists of spontaneous speech data from dialects of the North Germanic languages across all of the Nordic countries. The linguistic data in the corpus comes from a variety of sources, both old and new. It is transcribed and linked to audio and video, has a map function, and can be searched in a large variety of ways. For Norwegian: 100 points of reference (målepunkt) in Norway. 400 informants, each doing a 10 minute interview and participating in a 30 minute conversation. Transcription resembling spoken language, and with Norwegian translation. Phonetic transcription translated by the ScanDiaSyn transliterator, and grammatically tagged by the NoTa-tagger.
Size Approx 2 M words (September 2010). More data will be added
Languages Norwegian dialects, Swedish dialects, Danish dialects, Faroese and Övdalian dialects (and soon Icelandic and Finland Swedish)
Rightholders Developer: The Text Laboratory. The Norwegian material is collected in collaboration with Norwegian University of Science and Technology and University of Tromsø. The material from the other Nordic countries is supplied by the respective countries.
Anticipated access policy Restricted: access only for research and development purposes. Association with Språkbanken needs to be examined/determined.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

56. Norwegian Corpus of Spoken Language (Norsk talespråkkorpus Oslo)

Type Speech corpus. 166 informants, representative with respect to variables such as gender, age, education and residence. Interviews and conversations. Orthographically transcribed speech linked to audio and video files. Web interface, searchable by text and variables. XML-tagged and grammatically tagged by the NoTa-tagger.
Size Ca. 900,000 words
Languages Norwegian
Rightholders Developer: The Text Laboratory.
Anticipated access policy Restricted: access only for research and development purposes. Association with Språkbanken needs to be examined/determined.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

57. The TAUS Corpus

Type Speech corpus. Original audio files and transcripts from the TAUS project. 59 informants are orthographically re-transcribed, and this transcription linked to the audio files. Web interface searchable by text and variables. XML-tagged and grammatically tagged by the NoTa-tagger.
Size Ca. 244,000 words
Languages Norwegian
Rightholders Developer: The Text Laboratory. Material from the TAUS-project.
Anticipated access policy Restricted: access only for research and development purposes. Association with Språkbanken needs to be examined/determined.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

58. The UPUS Corpus

Type Speech corpus. Conversations and interviews with young people from multi-ethnic groups in Oslo. Orthographically transcribed and linked to audio and video files. Web interface, searchable by text and variables. XML-tagged and grammatically tagged by the NoTa-tagger.
Size Unknown. Currently interviews and conversations with 55 adolescents
Languages Norwegian
Rightholders Developer: The Text Laboratory. Material from the UPUS-project.
Anticipated access policy At the UPUS project's disposal. Currently only accessible for research related to the project.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

59. GREI

Type Grammar game/treebank. Morphological and syntactic analysis of sentences in Norwegian bokmål and nynorsk, used for grammar games and syntactic tree construction. Encoded according to the VISL project standard.
Size 750 sentences in Norwegian Bokmål and 750 sentences in Norwegian Nynorsk
Languages Norwegian (Bokmål and Nynorsk)
Rightholders Developer: the VISL-project (at the University of Southern Denmark, Odense). The Text Laboratory is responsible for creating and analysing the Norwegian sentences.
Anticipated access policy The Norwegian sentence analyses are freely accessible. The VISL-project at the University of Southern Denmark is the rightholder of games and other tools.
Anticipated reuse policy The Norwegian sentence analyses are freely accessible
Anticipated location University of Southern Denmark, Odense: beta.visl.sdu.dk
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

60. The Sofie Treebank

Type Treebank. Aligned sentences in nine languages from two chapters of Jostein Gaarder's Sophie's World.
Size Sentences from two chapters in Sophie's World
Languages Danish, Dutch, English, Estonian, Finnish, German, Icelandic, Norwegian and Swedish.
Rightholders Developer: The Text Laboratory in cooperation with the Nordic Treebank Network participants.
Anticipated access policy Restricted: access only for research and development purposes.
Anticipated reuse policy Unresolved. Association with Språkbanken needs to be examined/determined.
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

61. The Norwegian Wordbank (Norsk ordbank)

Type Lexical database. Electronic database with lexical base units. Each unit is linked to all its inflectional forms. Base forms from Bokmålsordboka, Nynorskordboka, the IBM glossary and more. Currently updated according to recent changes in orthography/spelling regulations. Words are listed with all inflectional forms.
Size Ca. 150,000 entries in Norwegian Bokmål and 124,000 in Norwegian Nynorsk.
Languages Norwegian (bokmål and nynorsk)
Rightholders Developer: EDD. A board with members from bokmålsleksikografi (Lexicography for Norwegian bokmål), Språkrådet, The Text Laboratory and EDD is responsible for operations/running and sale/marketing.
Anticipated access policy Accessible through GPL-licence, otherwise for sale.
Anticipated reuse policy Unresolved. Probably to be made accessible through Språkbanken.
Anticipated location EDD, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

62. Nordic Syntactic Judgment Database

Type Database. Electronic database with survey data from Nordic dialects. For Norwegian: 100 points of reference (målepunkt), 400 informants. Data from corresponding surveys/investigations in other Nordic countries provided through ScanDiaSyn.
Size Unknown. Growing
Languages Nordic dialects
Rightholders Developer: The Text Laboratory. The Norwegian material is collected in collaboration with Norwegian University of Science and Technology and University of Tromsø. The material from the other Nordic countries is supplied by the respective countries.
Anticipated access policy Accessible for research purposes.
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

63. ScanLex-leksikon

Type Lexical database. Aligned word lists from English and six Nordic languages/language variants. Automatically generated from the parallel aligned texts in the OPUS corpus.
Size Ca. 76,000 pairs of words
Languages Danish, Icelandic, Norwegian bokmål, Norwegian nynorsk, Swedish and English
Rightholders Developer: The Text Laboratory.
Anticipated access policy Public and free.
Anticipated reuse policy Unresolved. Probably to be made accessible through Språkbanken.
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

64. List of animate nouns

Type Word list/lexicon/glossary. List of Norwegian animate nouns, extracted from Norwegian web sites using automated Google searches.
Size 1018 nouns
Languages Norwegian
Rightholders Developer: Anders Nøklestad, The Text Laboratory.
Anticipated access policy Public and free.
Anticipated reuse policy Unresolved. Probably to be made accessible through Språkbanken.
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

65. Anaphora resolution system

Type Language processing tool. Tool identifying the antecedents of pronominal anaphors in Norwegian texts. The Oslo-Bergen tagger is used to pre-process the input text.
Size n/a
Languages Norwegian
Rightholders Developer: Anders Nøklestad, The Text Laboratory.
Anticipated access policy Public and free.
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

66. PP Scope Disambiguator

Type Language processing tool. Tool for disambiguating PP scope. Determines whether PPs which are syntactically ambiguous with respect to scope modify a preceding noun or the main verb in the sentence. The Oslo-Bergen tagger is used to pre-process the input text.
Size n/a
Languages Norwegian
Rightholders Developer: Anders Nøklestad, The Text Laboratory.
Anticipated access policy Public and free.
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

67. Named Entity Recognizer for Norwegian

Type Language processing tool. Part of the Oslo-Bergen-tagger.
Size n/a
Languages Norwegian
Rightholders Developers: The Text Laboratory and Aksis, University of Bergen (now Uni Digital)
Anticipated access policy May be downloaded for non-commercial use according to GPL conditions.
Anticipated reuse policy Special terms for use of the LISP rule interpreter.
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

68. Named Entity Recognizer for Norwegian 2

Type Language processing tool. Named entity recognition tool (NE-recognizer). The NER classifies names using statistical methods (memory-based learning or maximum entropy modeling). The Oslo-Bergen tagger is used to pre-process input text.
Size n/a
Languages Norwegian
Rightholders Developer: Anders Nøklestad, The Text Laboratory.
Anticipated access policy Public and free
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo / Uni Digital
Effort needed (a) technical (b) nontechnical n/a. Currently available
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

69. The NoTa-tagger

Type Language processing tool. Statistical speech tagger (tree tagger) trained on material from NoTa-Oslo.
Size n/a
Languages Norwegian
Rightholders Developer: The Text Laboratory.
Anticipated access policy Public and free
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

70. The Oslo-Bergen-tagger

Type Language processing tool. Morphological and syntactic CG1-tagger for Norwegian bokmål and nynorsk. Norsk ordbank is used for multi-tagging and pre-processing. The rule interpreter is implemented in Allegro Common Lisp. A later version, CG3, uses a rule interpreter from University of Southern Denmark, Odense.
Size n/a
Languages Norwegian
Rightholders Developers: The tagger project (The Text Laboratory and EDD) and Aksis, University of Bergen (now Uni Digital)
Anticipated access policy May be downloaded for non-commercial use according to GPL conditions.
Anticipated reuse policy Unresolved. Linguistic rules can probably be made accessible through Språkbanken. Special terms for use of the LISP rule interpreter. CG3 rule interpreter under GPL conditions from SDU.
Anticipated location The Text Laboratory, ILN, University of Oslo / Uni Digital
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

71. ScanDiaSyn Dialect Transliterator

Type Language processing tool. Semi-automatic dialect translator translating between dialect and Norwegian bokmål.
Size n/a
Languages From Norwegian dialects to bokmål
Rightholders Developer: The Text Laboratory.
Anticipated access policy Public and free.
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical none
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

72. Glossa

Type Corpus search and results management system. Web-based tool facilitating writing complex search expressions, exploring result sets, creating statistics based on the result sets, and editing and storing the result sets. For different types of corpora: one language, parallel corpora and speech corpora (sound and audio)
Size n/a
Languages n/a
Rightholders Developer: The Text Laboratory.
Anticipated access policy Public and free under GPL-license
Anticipated reuse policy GPL-license
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical The system is operative, but needs extensions in the context of an infrastructure centre
Rationale for selection n/a
Present usage Used by several research groups in Norway and abroad
Similar resources or cooperations n/a
Data or tool tool

73. SIMPLE-editing (SIMPLE-redigering)

Type Lexicon editing system. Editing system making it possible to edit the complicated/complex SIMPLE dictionary for Norwegian.
Size n/a
Languages Norwegian
Rightholders Developers: The Text Laboratory and Lexicography for Norwegian bokmål (bokmålsleksikografi), ILN
Anticipated access policy Public and free.
Anticipated reuse policy n/a
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

74. North Sami analyser (Nordsamisk analyser)

Type tagger/analyser
Size 115,779 entries
Languages sme
Rightholders University of Tromsø
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 2 man-years
Rationale for selection Working analysers
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

75. Lule Sami analyser (Lulesamisk analyser)

Type tagger/analyser
Size 37,450 entries
Languages smj
Rightholders University of Tromsø
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 2 man-years
Rationale for selection Working analysers
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

76. South Sami analyser (Sørsamisk analyser)

Type tagger/analyser
Size 62,386 entries
Languages sma
Rightholders University of Tromsø
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 2 man-years
Rationale for selection Working analysers
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

77. Faroese analyser (Færøysk analyser)

Type tagger/analyser
Size 87,528 entries
Languages fao
Rightholders University of Tromsø
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 2 man-years
Rationale for selection Working analysers
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

78. Greenlandic analyser (Grønlandsk analyser)

Type tagger/analyser
Size 159,392 entries
Languages kal
Rightholders University of Tromsø, Oqaasileriffik
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 2 man-years
Rationale for selection Working analysers
Present usage 400 entries/day
Similar resources or cooperations n/a
Data or tool tool

79. North Sámi corpus (Nordsamisk korpus)

Type corpus
Size 485,509 words
Languages sme
Rightholders University of Tromsø
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 1 man-year
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

80. Lule Sámi corpus (Lulesamisk korpus)

Type corpus
Size 25,832 words
Languages smj
Rightholders University of Tromsø
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 1 man-year
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

81. South Sámi corpus (Sørsamisk korpus)

Type corpus
Size 15,211 words
Languages sma
Rightholders University of Tromsø
Anticipated access policy GPL
Anticipated reuse policy GPL
Anticipated location University of Tromsø
Effort needed (a) technical (b) nontechnical 1 man-year
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

82. Norwegian University of Science and Technology database of spoken language

Type Speech database, partly annotated
Size 8000 sound files, 13.1 GB
Languages (Mainly) Norwegian, English and Czech: Norwegian spoken by natives and non-natives (Chinese, English, German, French, Russian, Persian); English spoken by natives and non-natives; Czech spoken by natives (Norwegian and Czech)
Rightholders ISK, Norwegian University of Science and Technology
Anticipated access policy Free for research purposes
Anticipated reuse policy Free for research purposes
Anticipated location ISK, Norwegian University of Science and Technology
Effort needed (a) technical (b) nontechnical The effort to make this material available in a systematic way is hard to estimate but will be large
Rationale for selection Useful for research in phonetics and speech technology
Present usage Internal use until now
Similar resources or cooperations n/a
Data or tool data

83. LEXIN

Type Web-based dictionaries made especially for immigrants in Norway. In addition to information about parts of speech, inflection and pronunciation, the dictionaries include simple explanations and examples of everyday usage, concrete as well as metaphorical. The Norwegian LEXIN project is based on a Swedish dictionary series of the same name. In Sweden, the LEXIN dictionaries have been translated into more than 20 languages and are published both electronically and in print. Since 1996, Uni Digital has worked on developing corresponding dictionaries for Norwegian, commissioned by the Ministry of Education, Research and Church Affairs, the Norwegian Board of Education (from 1999), and the Norwegian Directorate for Education and Training (since 2004). The current inventory consists of a Bokmål dictionary, a Nynorsk dictionary, a Bokmål-Nynorsk dictionary, and 25 dictionaries from Bokmål or Nynorsk to 13 languages. Three dictionaries are currently under development.
Size 28 dictionaries
Languages Norwegian, Arabic, Kurdish (Kurmanji), Kurdish (Sorani), Persian, Polish, Russian, Somali, Tamil, Thai, Tigrinya, Turkish, Urdu, English.
Rightholders Norwegian Directorate for Education and Training (Utdanningsdirektoratet)
Anticipated access policy Restricted, unresolved
Anticipated reuse policy Unresolved
Anticipated location Uni Digital
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection The LEXIN dictionaries are developed for immigrants with little or no experience in the use of dictionaries or other linguistic resources. The dictionaries are clearly set out and easy to use, among other things because all information about an entry word is found with the word itself.
Present usage Publicly available online.
Similar resources or cooperations The Swedish and Danish LEXIN Dictionaries
Data or tool data/tool

84. Scarrie proofreading tool

Type Proofreading tool. The Norwegian part of SCARRIE aims at advanced spelling correction in Bokmål. It uses word form dictionaries in combination with special mechanisms for handling multi-word expressions and for recognizing newly seen compounds, proper names and other words not present in the dictionaries. In cooperation with the Norwegian University of Science and Technology, a suitable Norwegian word form dictionary has been built. Predictable misspellings are supplied with recommendations for corrections. New compounds are detected by an analysis based on rules supplied by the University of Oslo. Words that are outside the scope of the dictionary and are likely errors are processed by the correction mechanisms, including sound-based similarity. In addition, a robust grammar was developed for the detection and correction of certain classes of errors which cannot be handled at word level, e.g. agreement errors. Finally, corrections are made so as to conform to the written norm used in the document (on a range from conservative to radical Bokmål).
Size Not relevant
Languages Norwegian
Rightholders The partners of the SCARRIE-project: WordFinder Software AB (Växjö, Sweden), Universitetet i Bergen, Institutionen för lingvistik at Uppsala Universitet, Center for Sprogteknologi (København) and Svenska Dagbladet (Stockholm).
Anticipated access policy Free for research purposes. The material is restricted by property rights and cannot be used for commercial purposes without agreement.
Anticipated reuse policy Restricted
Anticipated location University of Bergen and Uni Digital
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection The main result consists of an implemented and tested prototype with enhanced capabilities for advanced error correction. It was tested on a limited test set and the results were favourable in comparison to state-of-the-art products.
Present usage n/a
Similar resources or cooperations The Swedish and Danish Scarrie tools
Data or tool tool

85. Scarrie lexicon

Type Norwegian word form dictionary. The word forms in this list are tagged with information about lemma (basic form), standard, style or written norm, morphosyntactic characteristics and possibly replacement. The lexical information for Norwegian has been coded in several word lists. The main lexicon comprises open class words for Bokmål: adjectives, adverbs, nouns and main verbs. This dictionary contains 360,933 word form entries organised in 72,626 lemmas (corresponding to citation forms). This means that, on average, about 5 inflected word forms are stored for each citation form. Additional separate word lists have been made for closed class (grammatical) words, affixes, abbreviations and words occurring only in multi-word expressions.
Size 360,933 wordform entries, 72,626 lemmas
Languages Norwegian
Rightholders The partners of the SCARRIE-project: WordFinder Software AB (Växjö, Sweden), Universitetet i Bergen, Institutionen för lingvistik at Uppsala Universitet, Center for Sprogteknologi (København) and Svenska Dagbladet (Stockholm).
Anticipated access policy Free for research purposes. The material is restricted by property rights and cannot be used for commercial purposes without agreement.
Anticipated reuse policy Restricted
Anticipated location University of Bergen and Uni Digital
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection The Norwegian lexicons for SCARRIE have been provided with new information not available before, specifically verb subcategorization and lexical variants (style replacements). The dictionaries with inflected forms contain massive information for replacement under given styles.
Present usage n/a
Similar resources or cooperations The Swedish and Danish Scarrie lexicons
Data or tool data

86. The Norwegian Treebank Pilot Project (TREPIL)

Type Development of a suitable methodology and sophisticated tools for the semiautomatic construction of a treebank.
Size n/a
Languages Norwegian
Rightholders University of Bergen and Uni Digital
Anticipated access policy n/a
Anticipated reuse policy n/a
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

87. BREDT

Type Development of statistical methods based on existing theories and resources to automatically detect referential chains in texts.
Size n/a
Languages Norwegian
Rightholders Unknown (possibly University of Bergen)
Anticipated access policy Demonstrator downloadable from the project web page.
Anticipated reuse policy Demonstrator downloadable from the project web page.
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

88. The Text Corpus for Norwegian Nynorsk (Det nynorske tekstkorpuset)

Type Corpus
Size More than 70 M words
Languages Norwegian Nynorsk
Rightholders Norsk Ordbok 2014, ILN, Universitetet i Oslo
Anticipated access policy Public and free: available on the Internet in both tagged and untagged versions
Anticipated reuse policy The textual property rights are regulated by NO 2014
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical 1 man-year for extending the property rights of the existing material. Addition of new material will continue throughout the NO 2014 project period, and is fully financed by the project.
Rationale for selection The corpus contains texts in Norwegian Nynorsk from the 1860s up to 2010. There are no restrictions on use. The corpus material is highly important in the development of standards/norms for the Norwegian language, is reusable and consists of both factual prose and fiction.
Present usage Used in the editing of Norsk Ordbok, as well as by public users online (user statistics not available).
Similar resources or cooperations None
Data or tool data

89. The Norsk Ordbok Card File Archive (Norsk Ordboks setelarkiv)

Type Database of electronic word cards
Size 3.2 M digitized word cards
Languages Norwegian Nynorsk
Rightholders Norsk Ordbok 2014, ILN, Universitetet i Oslo
Anticipated access policy Public and free: available on the Internet
Anticipated reuse policy No restrictions on use
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical More effort could be put into online accessibility, with the workload estimated at ca. 1 man-year. All technical resources will need regular maintenance/updating, estimated at 1-2 man-years every 10 years. The 2014 card application was recently developed by the Department for digital documentation (Eining for digital dokumentasjon), University of Oslo
Rationale for selection The archive is collocated with similar/corresponding collections in Norsk Ordboks metaordbok.
Present usage Used in the editing of Norsk Ordbok, as well as by public users online (user statistics not available).
Similar resources or cooperations Unique database, no similar existing resources for Norwegian
Data or tool data

90. The Norwegian Atlas of Dialects (Norsk Dialektatlas, kartsamlinga)

Type Collection of digital dialectal maps
Size 596 maps
Languages Norwegian Nynorsk
Rightholders Norsk Ordbok 2014, ILN, Universitetet i Oslo
Anticipated access policy Public and free: available on the Internet
Anticipated reuse policy No restrictions on use
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical The search system is currently being developed. User systems for digital maps are still under development internationally. The collection of digital maps will be integrated into the existing collection of resources to give a more complete overview of Norwegian dialects. At least 2 man-years are estimated to link language data and maps.
Rationale for selection Accounts for spoken language from all parts of Norway. Contains dialect isoglosses and word geography.
Present usage Used in the editing of Norsk Ordbok, as well as by public users online (user statistics not available).
Similar resources or cooperations Unique database, no similar existing resources for Norwegian
Data or tool data

91. The Dialect Synopsis (Målføresynopsisen)

Type Database of scanned protocol pages and a rudimentary search interface
Size 43 protocols (more than 10,000 handwritten protocol pages)
Languages Norwegian Nynorsk
Rightholders Norsk Ordbok 2014, ILN, Universitetet i Oslo
Anticipated access policy Public and free: available on the Internet
Anticipated reuse policy No restrictions on use
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical The search system is currently being developed, and data is being added to the digital Atlas. Estimated effort to make the search system sufficiently accurate is approximately 1 man-year.
Rationale for selection Accounts for spoken language from all parts of Norway, mapping Norwegian dialect phonology, morphology and, in part, syntax. Point of reference for future spoken language research, both for Norwegian and other Scandinavian languages.
Present usage Used in the editing of Norsk Ordbok, as well as by public users online (user statistics not available).
Similar resources or cooperations Unique database, no similar existing resources for Norwegian
Data or tool data

92. The Dictionary Hotel (Ordbokshotellet)

Type Database of digitized versions of published collections of words including metadata from all over Norway.
Size Ca. 30 digitized collections (1 September 2010). Another 30 collections are ready to be incorporated in the database. In total there are approximately 500 such collections of words which should be integrated in this database.
Languages Norwegian Nynorsk
Rightholders Norsk Ordbok 2014, ILN, Universitetet i Oslo
Anticipated access policy Public and free: available on the Internet
Anticipated reuse policy No restrictions on use
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical Approximately 0.5 man-years for each collection of words (500 collections in total)
Rationale for selection Accounts for spoken language from all parts of Norway. Contributes to the overall knowledge of Norwegian dialects/spoken language.
Present usage Used in the editing of Norsk Ordbok, as well as by public users online (user statistics not available).
Similar resources or cooperations Unique database, no similar existing resources for Norwegian
Data or tool data

93. The Norsk Ordbok Meta Dictionary (Norsk Ordboks metaordbok)

Type Electronic index of Norsk Ordbok 2014's digital resources. In this index, the material is organized and made accessible by normalized entries.
Size Ca. 600,000 entries
Languages Norwegian Nynorsk
Rightholders Norsk Ordbok 2014, ILN, Universitetet i Oslo
Anticipated access policy Public and free: available on the Internet
Anticipated reuse policy No restrictions on use
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical As of October 2010, more than 10 man-years have been put into normalizing the Meta Dictionary according to the 1938 standard. This effort requires regular and continuous maintenance of standardizing procedures. Due to the constant addition of new material, resources are allocated to this work throughout the entire project period. The costs of integrating the Meta Dictionary with the Norwegian Word Bank have not yet been estimated, since this is not part of the NO 2014 project plan and responsibilities.
Rationale for selection This index accounts for Norwegian Nynorsk according to the spelling standard of 1938, and corresponding indexes should be established for each of the official spelling standards in order to account for the entire history of Norwegian standards. As a fully developed resource, the Meta Dictionary could be integrated with the Norwegian Word Bank, increasing the current lemma inventory from 100,000 entries to approximately 600,000.
Present usage Used in the editing of Norsk Ordbok, as well as by public users online (user statistics not available).
Similar resources or cooperations Unique database, no similar existing resources for Norwegian
Data or tool data

94. The Nynorsk version of the Norwegian Word Bank (Norsk Ordbank, nynorskversjonen)

Type Electronic database of lemmas, incl. register of all inflectional forms
Size Ca. 100,000 entries
Languages Norwegian Nynorsk
Rightholders Department of Linguistics and Scandinavian Studies (University of Oslo)/The Norwegian Language Council (Språkrådet)
Anticipated access policy Available on the Internet, password-restricted access
Anticipated reuse policy Restricted (this resource is of commercial interest for products requiring Norwegian according to current spelling norms. This applies to both digital and paper products)
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical Under continuous development. Analyser tools should be applied in order to increase the number of lemmas. Upgrades estimated at approximately 0.5 man-years during 2010.
Rationale for selection Accounts for Norwegian Nynorsk according to current spelling norms. Also provides a historical overview of Norwegian spelling norms.
Present usage Used in the development of standards/norms for Norwegian language
Similar resources or cooperations Unique database, no similar existing resources for Norwegian
Data or tool data

95. The Dictionary Home (Ordboksheimen)

Type Electronic database of older dialectal collections
Size Ca. 20,000 words
Languages Older Norwegian Nynorsk (not standardized)
Rightholders Norsk Ordbok 2014, ILN, Universitetet i Oslo
Anticipated access policy Public and free: available on the Internet
Anticipated reuse policy No restrictions on use
Anticipated location Department of Linguistics and Scandinavian Studies (University of Oslo)
Effort needed (a) technical (b) nontechnical Currently being developed. Approximately 10 man-years of specialized work is required to collect and digitize all existing material of older Norwegian dialect text/samples.
Rationale for selection Accounts for spoken language from all over Norway, contributes to the overall knowledge of Norwegian dialects/spoken language and their history.
Present usage Used in the editing of Norsk Ordbok, as well as by public users online (user statistics not available).
Similar resources or cooperations Unique database, no similar existing resources for Norwegian
Data or tool data

96. ADB_OD_Nor.NOR by NST

Type Acoustic database for speech recognition. Recorded for acoustic modelling for PC/Multimedia speech recognition and dictation software. The recordings were made in office environments and are based on phonetically balanced manuscripts derived from the Norwegian corpus. The database consists of a training and a testing part. The training part is used to train the acoustic model and the testing part is used to test it. One sound file contains one manuscript line, most often a sentence, in some cases a phrase or a single word. The recording script for the training data contains a dictation part and an ASR part. The dictation part is aimed at general dictation and contains regular sentences extracted from the corpus. The first 222 units (sentences) are aimed at dictation. The last 90 units are aimed at ASR and consist of person names, place names, single words, acronyms and other types of data specifically needed for training a speech recognizer. The recording script for the test database is similarly divided into a dictation part and an ASR part.
Size 312 training recordings, 987 test recordings
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Free for research and development purposes
Anticipated reuse policy Restricted (free for research and development purposes)
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

97. ADB_D_IBM-N by NST

Type Acoustic database for speech recognition. Recorded for acoustic modelling for dictation software (desktop). The recordings were made with the IBM-software ObjectRexx in the start-up phase of the cooperation between NST and IBM as a part of the training of NST-employees. The database consists of three parts recorded for the purposes of testing, training and modelling. One sound file contains one manuscript line (e.g., sentence, phrase, single word, series of digits and numbers and series of letters). This database is not validated, therefore documentation is limited.
Size 576 lines, 33,360 recordings
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Free for research and development purposes
Anticipated reuse policy Restricted (free for research and development purposes)
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

98. ADB_T_Nor.NOR by NST

Type Acoustic database for speech recognition. Contains telephone recordings over landline and mobile phones. These data are aimed at speech recognition over the telephone. The material is not divided into testing and training data. NST followed the general SpeechDat II procedures for the recordings. The recordings were made partly with LandH-software and partly with UMS Diginform. The recordings contain 17 utterances of semi-spontaneous speech in the form of answers to questions and 40 utterances of read sentences. This database is only partly validated.
Size 3108 landline recordings and 1596 mobile phone recordings (validated).
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Free for research and development purposes
Anticipated reuse policy Restricted (free for research and development purposes)
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

99. Database with recorded hesitation sounds by NST

Type Acoustic database for speech recognition. Collected for the creation of acoustic models of hesitation sounds, i.e., non-verbal sounds produced between words when a speaker hesitates. This material is used for general dictation systems.
Size 50 sentences, 300 recordings
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Free for research and development purposes
Anticipated reuse policy Restricted (free for research and development purposes)
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

100. Speech Synthesis for Norwegian by NST/IBM

Type Acoustic database for speech synthesis. For the development of IBM's speech synthesiser, professional voices were engaged for the recordings, i.e., one male voice per language. The recordings were made with IBM equipment in a sound studio in Voss, Norway. The use of this proprietary recording software does not prevent future use of the data, since the data are available in standard PCM format. The recording manuscripts are based on the NST corpus. An optimal set of sentences was produced with IBM's OptScript software.
Size 5363 recordings
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Free for research and development purposes
Anticipated reuse policy Restricted (free for research and development purposes)
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

101. The Norwegian NST lexicon

Type Meta-lexicon compiled from several resources. The lexicon was augmented with data from a Norwegian inflector program. The inflector's input data consist of 50,000 base forms. These are identical to those in NorKompLeks, a bought resource based on Bokmålsordboka. The base forms are converted to SAMPA, manually checked and, if necessary, changed to NST's transcription conventions. The transcriptions of approx. 254,000 entries are manually checked, while the 499,000 entries generated from the inflector are only partially checked. All entries, except garbage terms, are annotated with information in all obligatory fields. The vocabulary is general and no special domains are represented. The lexicon consists of the 100k-list. All terms in the NST recording manuscripts are transcribed in the lexicon. Further, the lexicon contains all entries in the Bokmålsordboka (via NorKompLeks/inflector) including inflected forms and all terms in the SpeechDat material. More person names, place names, company names, etc. (from e.g., Onomastica) have been added to the lexicon in later projects.
Size Total number of entries: 784,240, total number of transcriptions: 1,006,562
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Free for research and development purposes
Anticipated reuse policy Free for research and development purposes
Anticipated location Language Technology Resource Collection for Norwegian – Språkbanken
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

102. Transcription conventions for Norwegian by NST

Type Guidelines for the transcription of the NST Norwegian lexicon and the phoneme inventory used in the Norwegian lexicon.
Size n/a
Languages n/a
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Free for research and development purposes
Anticipated reuse policy Free for research and development purposes
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data/tool

103. INSO (bokmål-utpakket, nynorsk-utpakket, NST)

Type Bought lexical resource, annotated with inflection, POS, morphology, compounding.
Size 71,006 base forms, 595,619 inflected forms
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Currently not accessible
Anticipated reuse policy n/a
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

104. The Norwegian Computational Lexicon (Norkompleks, NST)

Type Bought lexical resource. The Norwegian Computational Lexicon (NorKompLeks) is the result of a collaboration funded by NFR, Telenor and the Norwegian University of Science and Technology. The outcome was a computational lexicon for both of the official Norwegian languages (Bokmål and Nynorsk). The selection of words in the computational lexicon is primarily from Bokmålsordboka and Nynorskordboka (both from the Lexicography division at the Department of Scandinavian Studies and Comparative Literature located at the University of Oslo). Annotated with POS, morphology, phonetic transcription.
Size 80,443 base forms, 460,777 inflected forms
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Currently not accessible
Anticipated reuse policy n/a
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

105. Onomastica (NST)

Type Bought lexical resource. The Norwegian material of a multi-language pronunciation lexicon of proper names. Annotated with POS, phonetic transcription, quality.
Size 556,499 names
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Currently not accessible
Anticipated reuse policy n/a
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

106. Statistisk sentralbyrå (NST)

Type Bought lexical resource. Pronunciation database of proper names. Annotated with frequency, POS.
Size 71,795 names
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Currently not accessible
Anticipated reuse policy n/a
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

107. Bronnoy_navn

Type Bought lexical resource. Pronunciation database of proper names. Annotated with POS.
Size 1,019,643 names
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Currently not available
Anticipated reuse policy n/a
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

108. Bokmål corpus by NST

Type Text corpus. The Bokmål Corpus was to some extent cleaned up before the development of manuscript sentences and lexical data took place. The resulting corpus consists of text files with approx. 735 M words, while the complete corpus consists of approx. 975 M words. The clean-up consisted of conversions from proprietary formats to text files, removal of duplicates, and removal of unusable files (e.g., TIFF files, QuarkXPress, FrameMaker, etc.). The work is limited to conversions into text format. The texts contain all text in the original documents. Correspondence was not anonymised, which would be necessary for general distribution and usage of the material. The material lacks tagging of linguistic information (POS, lemma, etc.). In some of the material, structure is marked (paragraphs, headings, etc.). This was done by the supplier and does not follow a standard defined for the corpus project (or any other XML/SGML standard). NST started to code the corpus in XML. Some of the ITAvisen texts are coded with structural information. 3.8 M words (0.4% of the complete material) were coded in this way. Only texts from ITAvisen were coded and no corresponding Nynorsk material is available.
Size Ca. 975 M words
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Currently not available
Anticipated reuse policy n/a
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

109. Nynorsk corpus by NST

Type Text corpus. Raw data from very few sources (mostly internet texts, very small files)
Size n/a
Languages Norwegian
Rightholders Joint ownership between University of Oslo, University of Bergen, Norwegian University of Science and Technology, The Norwegian Language Council (Språkrådet) and IBM AS
Anticipated access policy Currently not available
Anticipated reuse policy n/a
Anticipated location The Norwegian Language Bank (Norsk Språkbank)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Part of the estate of Nordisk Språkteknologi Holding AS
Present usage Commercial use or research
Similar resources or cooperations n/a
Data or tool data

110. Prehistoric Artifacts, Sites and Monuments in Western Norway

Type Archaeology: A comprehensive overview of historical sites, monuments and artifacts in the 78 municipalities belonging to the region served by the Bergen Museum.
Size n/a
Languages Norwegian
Rightholders Bergen museum, UiB
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

111. The Main Catalogue of the Artefact Collection (Oldsaksamlingen)

Type Archaeology: The main catalogue of acquisitions for Oldsaksamlingen in Oslo describes all the artifacts that have arrived at the museum. All printed annual acquisition records are now available.
Size n/a
Languages Norwegian
Rightholders Etnografisk museum, University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

112. The Archaeology Database

Type Archaeology: A prototype of the various museum databases that are being developed. This database makes it possible to search in the acquisition catalogue of Oldsaksamlingen (see above) and among artifacts and recorded sites and monuments found in the Marum area in the Sandefjord municipality.
Size n/a
Languages Norwegian
Rightholders Bergen museum, UiB
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

113. The Bergen Museum Main Inventory Catalogue

Type Archaeology, converted archive not yet available
Size n/a
Languages Norwegian
Rightholders Bergen museum, UiB
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

114. The Bergen Museum Topographical Archives

Type Archaeology, converted archive not yet available
Size n/a
Languages Norwegian
Rightholders Bergen museum, UiB
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

115. The Tromsø Museum Main Inventory Catalogue

Type Archaeology, converted archive not yet available
Size n/a
Languages Norwegian
Rightholders UiTø
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

116. Lists of Photographs, Tromsø Museum

Type Archaeology, converted archive not yet available
Size n/a
Languages Norwegian
Rightholders UiTø
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

117. The Main Inventory Catalogue of the Museum of Science and Natural History, Norwegian University of Science and Technology

Type Archaeology, converted archive not yet available
Size n/a
Languages Norwegian
Rightholders Museum of Science and Natural History, Norwegian University of Science and Technology
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

118. The Bokmål Dictionary (Bokmålsordboka)

Type Lexicography, electronic dictionaries: The Section for Norwegian Lexicography at the Department for Scandinavian Languages and Comparative Literature at the University of Oslo has collaborated with the Norwegian Universities' Documentation Project to offer a simplified version of the most recent edition of the Bokmål Dictionary on the Internet.
Size n/a
Languages Norwegian
Rightholders ILN/University of Oslo and Språkrådet
Anticipated access policy Public and free
Anticipated reuse policy Restricted
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

119. The Nynorsk Dictionary (Nynorskordboka)

Type Lexicography, electronic dictionaries: The Section for Norwegian Lexicography at the Department of Scandinavian Languages and Comparative Literature at the University of Oslo has collaborated with the Norwegian Universities' Documentation Project with the aim of offering a simplified version of the most recent edition of the Nynorsk Dictionary on the Internet.
Size n/a
Languages Norwegian
Rightholders ILN/University of Oslo and Språkrådet
Anticipated access policy Public and free
Anticipated reuse policy Restricted
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

120. The Lexical Source Manuscript (Grunnmanuskriptet)

Type Lexicography, electronic dictionaries: Grunnmanuskriptet is an old dictionary manuscript from the 1930s (approximately 13500 typewritten A4 pages). The entries were collected from the dictionaries by Aasen, Ross, Schjøtt, Vidsteen, Torp and others, but the definitions for the entries are all given in Nynorsk. The manuscript was never published as a dictionary, but has provided the basis for the development of the Norwegian Dictionary (Norsk Ordbok). This manuscript has now been recorded as electronic text and marked in a way that enables the reader to find information such as dialects with their location, etymology, quotations and sources.
Size Ca. 13,500 typewritten A4 pages
Languages Norwegian
Rightholders Section for Norwegian Lexicography and Dialectology in the Department of Scandinavian Languages and Comparative Literature, University of Oslo.
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

121. Literary texts, Bokmål

Type Texts: Through the Documentation Project, a large part of the Bokmål text material found in the Section for Norwegian Lexicography and Dialectology in the Department for Scandinavian Languages and Comparative Literature at the University of Oslo was digitized. The material in the archives consists of several card files with excerpts from Norwegian literature. Instead of digitizing the material in the author archives, the complete texts were scanned. The scanned material includes texts dating from approximately 1550 to 1900, but most of it is from the 1800s. A total of 60000 book pages were scanned.
Size 60,000 pages
Languages Norwegian
Rightholders Section for Norwegian Lexicography and Dialectology in the Department of Scandinavian Languages and Comparative Literature, University of Oslo.
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

122. Older collections of words

Type Texts: The Nynorsk section of the project has scanned in a selection of 34 older collections of words. Five of these have been given key words according to the 1938 standard: Norderhov (1698), Robyggjelaget (end of the 1600s), "Den Norske Dictionarium" (printed in Copenhagen in 1646), Stavanger (1698), and Bø in Vesterålen (1698). The remaining 29 will eventually be made accessible for free text searches.
Size n/a
Languages Norwegian
Rightholders Section for Norwegian Lexicography and Dialectology in the Department of Scandinavian Languages and Comparative Literature, University of Oslo.
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

123. Texts in Nynorsk

Type Texts: The scanned material from the Nynorsk project section is made up primarily of texts of which only fragments are given in the card file. These texts will therefore supplement the card-file database. The material includes a number of complete literary texts by various authors writing in Nynorsk, the 1921 edition of the Bible in Nynorsk, a selection of books from the Norwegian Folklore Association series (NFL), and some complete annual volumes of "Syn og Segn", a Nynorsk literary journal. In addition, a selection of older collections of words (see above) have been scanned in. The scanned material will provide the basis for a larger body of texts which will be made available for free text searches. These texts are not accessible at this stage.
Size n/a
Languages Norwegian
Rightholders Section for Norwegian Lexicography and Dialectology in the Department of Scandinavian Languages and Comparative Literature, University of Oslo.
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

124. The Norwegian Dictionary (Norsk Ordbok)

Type Word Archives: The card file consists of approximately three million entries with key words organized alphabetically in accordance with the 1938 standard. Each entry contains excerpts from Nynorsk literature, journals and newspapers. In addition, information about dialects is provided by native speakers all around the country. A facsimile has been made of each entry, allowing the electronic retrieval of its image. Basic information related to each entry has also been recorded, for example the key word (the word or phrase defined or illustrated by the entry, standardized according to the 1938 standard), the grammar (which part of speech the key word corresponds to) and the source (the source of the information on the entry).
Size Ca. 3 M entries
Languages Norwegian
Rightholders Section for Norwegian Lexicography and Dialectology in the Department of Scandinavian Languages and Comparative Literature, University of Oslo.
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

125. The Norwegian Dictionary (Norsk Ordbok) - additions after 1990

Type Word Archives: Since the Documentation Project was launched, new entries have been recorded in the card file. These new entries will eventually be incorporated in the card-file database, but, in the meantime, they are stored in a separate database for new acquisitions.
Size n/a
Languages Norwegian
Rightholders Section for Norwegian Lexicography and Dialectology in the Department of Scandinavian Languages and Comparative Literature, University of Oslo.
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

126. The Dictionary of Dialect Words from the Trøndelag Region (Trønderordboka)

Type Word Archives: In 1981 a project was initiated at the Department of Scandinavian Studies and Comparative Literature, Norwegian University of Science and Technology, Trondheim, which was aimed at compiling a dictionary of the Trøndelag dialects. A total of approximately 180000 entries have been collected, giving examples of the variations of the Trøndelag dialect. The examples have been collected from literature and from the spoken language. The entries are of the same type as those contributing to the Norwegian Dictionary (Norsk Ordbok). They consist of a key word, information about or examples of how the word is used, and information about the source of the recorded information.
Size Ca. 180,000 entries
Languages Norwegian
Rightholders Department of Scandinavian Studies and Comparative Literature, Norwegian University of Science and Technology
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

127. The New Words Database

Type Word Archives: The New Words Archives at the Section for Norwegian Lexicography and Dialectology in the Department for Scandinavian Languages and Comparative Literature at the University of Oslo, contains quotations from newspapers, journals and magazines. There are approximately 300000 quotes from 174 different sources. The compilation of this database has been going on for several decades, so the term "new word" should be interpreted in an historical perspective. The word was new, or had acquired a new meaning, or had come to be used in a different way at the time of registration. This edition of the New Words Database consists of 116005 quotes, the majority of which are from the years 1968 to 1972. However, the oldest entries are from 1920 while the most recent ones are from 1994. The number of quotations will gradually increase as more of the material is processed. In each citation one or more of the words are selected (excerpted) and dealt with (this edition offers a total of 195744 excerpts). The excerpted words are transformed to their basic form and provided with a code for grammatical function (part of speech) and other relevant codes. Simple words (including derivatives and composites) are marked with a single code for grammatical function while composite words are given codes for both word elements although it is the last element that determines the grammatical function of the composite word. One hundred four different auxiliary codes provide information about morphology, phraseology, imagery, etc.
Size 116,005 quotes
Languages Norwegian
Rightholders Section for Norwegian Lexicography and Dialectology in the Department of Scandinavian Languages and Comparative Literature, University of Oslo.
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

128. Allex - African Languages Lexical Project

Type Word Archives: Text Corpora, Sound Corpora, Parsers, Dictionaries for Zimbabwe languages
Size n/a
Languages Shona, Ndebele, Nambya
Rightholders Department of Linguistics and Scandinavian Studies (University of Oslo); African Languages Research Institute, University of Zimbabwe, Zimbabwe; Unit for Digital Documentation (University of Oslo)
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

129. Ballads

Type Folklore Studies: In its work with the ballad material, the Documentation Project has aimed to make a scholarly edition of Norwegian ballads in electronic format which faithfully reproduces the original. The original material, housed in several different archives, includes original manuscripts of ballad texts, old notations of the tunes and audio recordings of old and more recent renditions of the ballads. This collection represents 240 types of Norwegian ballads. Approximately 3900 different varieties have been digitized in this project. The material is extensive and varied. Some of the digitized texts are accompanied by notations of the tunes and audio recordings.
Size 240 types of Norwegian ballads, approximately 3900 different varieties have been digitized
Languages Norwegian
Rightholders Department of Cultural Studies, University of Oslo and Norwegian Ballad Archives
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

130. Court proceedings and protocols

Type Digitized records of 1600s and 1700s court proceedings at the lowest courts, where a public registrar or a notary public presided as judge. The records include accounts of disagreements over debts, conflicts between neighbors, broken marriage vows as well as serious criminal acts. The Gothic handwriting is different from printed Gothic text as well as from modern handwriting. The language used in the protocols is unfamiliar, being influenced by formal, official styles of Danish and German. Furthermore, the texts include a large number of peculiar symbols and abbreviations. The oldest preserved Norwegian court protocols are from Rogaland in southwestern Norway (Jæren and Dalane, 1613, and Ryfylke, 1616) and Finnmark (1620) in the far north. In 1633 a royal decree ordered the recording of court protocols at the lower courts in Norway. Nonetheless, only a few survive from the first half of the 1600s. After the introduction of absolute monarchy in 1660, compliance with this decree seems to have been the rule, though many of the records from this period have been lost. After 1700 the court protocols were systematically stored in well-organized volumes but, even so, there are occasional gaps.
Size n/a
Languages Norwegian
Rightholders Department of History, University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

131. Diplomatarium Norvegicum

Type Medieval Material: Diplomatarium Norvegicum is a series of text sources which give a verbatim and linguistically faithful reading of documents older than 1570. It is now, in 1998, 150 years since the first volume was published; the first of a total of 22 volumes which include approximately 19000 documents. As the foremost example of Norwegian source editions, Diplomatarium Norvegicum is the principal source for anyone working with medieval text material.
Size n/a
Languages Norwegian
Rightholders Old Norse Dictionary Unit (Gammalnorsk Ordboksverk), University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

132. Henrik Wergeland - collected works

Type Literature: The texts in this web edition correspond to those found in the 23-volume edition edited by Herman Jæger, Didrik Arup Seip, Halvdan Koht and Einar Høigård, published by Steenske Forlag, Kristiania/Oslo 1918-40. As part of the Bokmål project, the texts were scanned and extensively coded in SGML. The web presentation (as HTML documents) is generated automatically from this coding. This makes the graphical layout of some of the pages look strange. We will continue to enhance the typographical quality. When printing these texts, one should be aware that some printers divide HTML documents somewhat at random. We have planned to make PDF versions of all the texts. These can be read by the program "ACROBAT READER" (which is published along with web browsers). The PDF format is better suited to preserving typography in print.
Size n/a
Languages Norwegian
Rightholders n/a
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

133. Norwegian Farm Names

Type Place names: O. Rygh’s collection, "Norwegian Farm Names" (Norske Gaardsnavne) consists of 18 volumes, one for each of the Norwegian counties. It contains information about all Norwegian farms and some of their subunits, amounting to a total of 55000 entries. The names are organized according to districts and by consecutive, increasing farm registration numbers. The electronic version of this collection ("Elektroniske Norske Gaardsnavne") is being created with support from the Norwegian Research Council and from the following counties: Østfold, Vestfold, Akershus, Rogaland, Hordaland, Møre og Romsdal, Sogn og Fjordane, Sør-Trøndelag and Nord-Trøndelag.
Size 55,000 entries
Languages Norwegian
Rightholders Section for Place Name Studies, Department of Scandinavian Studies and Comparative Literature, University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

134. The Land Register Draft of 1950 (Matrikkelutkastet frå 1950)

Type Place names: The Norwegian Finance Department has compiled approximately 85000 lists of real estate in Norway, organized by consecutive, increasing farm numbers within each municipality. The revision of the land register was never completed, and since Finnmark county is not included in the lists, they are referred to as a draft. In addition to farm names, the draft includes the names of private homes, vacation homes, lots, public and private institutions, etc. The land register draft is an important tool for the State Name-Consultancy Service (Statens navnekonsulenttjeneste) in its efforts to standardize place names. Since it also includes the names of the owners of all the listed properties, it can also be of use to researchers studying names of people. In fact, it is often the only accessible comprehensive source of names for newer properties.
Size Approximately 85,000 lists of real estate
Languages Norwegian
Rightholders Section for Place Name Studies, Department of Scandinavian Studies and Comparative Literature, University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

135. The Home Name Register

Type Place names: This register, found in the Section for the Study of Names at the University of Oslo, includes the names of homes (farms, their subunits and summer pastures) from ten of Norway’s counties. The names are organized by consecutive, increasing farm numbers within each district. The register includes 109000 archive cards which provide information on the spelling of the name, its pronunciation, correct preposition and dative form of the name, older versions of the spelling, and variations in spelling and pronunciation. They will often include comments on topography, peculiarities of dialect, and the interpretation of the name.
Size 109,000 archive cards
Languages Norwegian
Rightholders Section for Place Name Studies, Department of Scandinavian Studies and Comparative Literature, University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

136. The Place Name Archives at the University of Tromsø

Type Place names: The North Norway Archives of Dialects at the Department of Language and Literature, University of Tromsø, includes records of Norwegian place names as well as collections of Saami and Finnish place names. A significant part of the material was collected locally in recent times, but it also includes copies of older collections from other institutions. The size of the collection is estimated at one million names. In the process of selecting material for digital conversion, certain guidelines have been developed for determining priorities.
Size Ca. 1 M names
Languages Norwegian
Rightholders Department of Language and Literature, University of Tromsø
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

137. Audio Recordings at the University of Tromsø

Type Place names
Size n/a
Languages Norwegian
Rightholders Department of Language and Literature, University of Tromsø
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

138. Writings by School Children from Nordland and Troms Counties (the Indrebø material)

Type Place names: This database includes approximately 100000 names of places in North Norway which were found in Indrebø’s collection of writings by school children.
Size Ca. 100,000 names
Languages Norwegian
Rightholders Department of Language and Literature, University of Tromsø
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

139. Literature Lists, the etymology register in Uppsala, Sweden

Type Place names: The largest collection of place names in the Nordic countries is found in the Ortnams Archives in Uppsala. It consists of approximately 240000 archive cards, a number which is constantly increasing as new material is being excerpted. The register includes Nordic names and name elements with literature references. The archives are not yet accessible by the public. While the material is being digitized, a separate list is being compiled over all the literature referred to in the archive.
Size Ca. 240,000 archive cards
Languages Swedish
Rightholders Etymology registry in Uppsala, Sweden, and Section for Name Research, Department of Scandinavian Studies and Comparative Literature, University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

140. Multi-tagger

Type The multi-tagger is a part of the Oslo-Bergen-tagger and is based on word lists from Norsk Ordbank. The multi-tagger performs morphological analysis, compound analysis and multi-word expression detection.
Size n/a
Languages Norwegian
Rightholders Developers: The tagger project (The Text Laboratory and EDD) and Aksis, University of Bergen (now Uni Digital)
Anticipated access policy May be downloaded for non-commercial use according to GPL conditions.
Anticipated reuse policy May be downloaded for non-commercial use according to GPL conditions.
Anticipated location The Text Laboratory, ILN, University of Oslo / Uni Digital
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

141. NorGram

Type LFG grammar which was developed in the Norwegian part of ParGram.
Size n/a
Languages Norwegian
Rightholders LLE, University of Bergen
Anticipated access policy Public and free, subject to licence. The grammar is open source and free of charge, and the demo version is available at http://decentius.aksis.uib.no/logon/xle.xml. However, the parser requires XLE (the Xerox Linguistic Environment) from PARC, for which users need to sign a licence. XLE is free of charge but comes without source code and with strong restrictions on usage.
Anticipated reuse policy Restricted
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical Will be continually updated in the INESS project
Rationale for selection Reusable computational grammar for Norwegian Bokmål and Nynorsk with broad empirical coverage and a healthy theoretical foundation.
Present usage Used within TREPIL.
Similar resources or cooperations NorGram is affiliated with the Parallel Grammar Project (ParGram), an international cooperative effort to develop parallel LFG grammars for English, French, German, Norwegian, Japanese and Urdu.
Data or tool tool

142. Norwegian Syntax-based Grammar (Norsyg)

Type HPSG grammar for Norwegian. A continuation of earlier grammars: NorSource, Saargram, Phdgram. The initial grammar was based on the Grammar Matrix version 0.6. The implementation platform is the LKB system. A known problem is limited coverage and robustness: the grammar provides output for roughly 50% of the input sentences.
Size n/a
Languages Norwegian
Rightholders Petter Haugereid, Norwegian University of Science and Technology
Anticipated access policy Free for research, LGPL. The dictionary can be downloaded from Norsk Ordbank's site at University of Oslo.
Anticipated reuse policy Free for research, LGPL
Anticipated location Norwegian University of Science and Technology
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Two important assumptions made in Norsyg distinguish it from other implemented grammars. First, the linking between the syntax and the semantics is done in the syntax rather than in the lexicon. Second, the topic is realized at the bottom of the tree, not at the top.
Present usage n/a
Similar resources or cooperations NorGram
Data or tool tool

143. Shallow PARsing of TAgged Norwegian Nouns (Spartan)

Type Parser. A package of Perl scripts for extracting dependency relations between nouns (from text). Requires input tagged with Oslo-Bergen-tagger.
Size n/a
Languages Norwegian
Rightholders Erik Velldal, University of Oslo
Anticipated access policy Public and free
Anticipated reuse policy Public and free
Anticipated location University of Oslo
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool tool

144. NorKompLeks

Type Computational lexicon for both of the official Norwegian written standards (Bokmål and Nynorsk). The selection of words in the computational lexicon is primarily from Bokmålsordboka and Nynorskordboka. The monolingual dictionary also provides argument structure for verbs.
Size n/a
Languages Norwegian
Rightholders The Department of Language and Communication Studies at the Norwegian University of Science and Technology
Anticipated access policy Available both for research and commercial use.
Anticipated reuse policy Available both for research and commercial use.
Anticipated location Norwegian University of Science and Technology
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

145. Stor Ordbok (the big dictionary)

Type Electronic dictionary. Digital version of the most comprehensive Norwegian-English and English-Norwegian dictionary in print.
Size 217,000 entries and multi-word expressions, and 522,000 translations.
Languages Norwegian, English
Rightholders Probably Kunnskapsforlaget
Anticipated access policy Restricted. Requires permission from publisher. Available through Internet subscription.
Anticipated reuse policy Restricted. Requires permission from publisher.
Anticipated location Unknown (possibly Kunnskapsforlaget)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection Lists both British and American word forms and spelling standards. Covers both everyday language and technical terms from special fields.
Present usage Has been used in several research projects.
Similar resources or cooperations The same publisher also has Norwegian-German and Norwegian-Italian dictionaries.
Data or tool data

146. TriTrans

Type Online multi-language dictionary. Plain words only.
Size Ca. 22,000 Norwegian words
Languages Norwegian, English, Spanish.
Rightholders n/a
Anticipated access policy Free of charge
Anticipated reuse policy Free of charge
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

147. Websters online dictionary

Type Online dictionary, Norwegian-English and English-Norwegian.
Size n/a
Languages Norwegian, English
Rightholders Websters
Anticipated access policy Crawling not allowed, but possibly free for research after permission from owners.
Anticipated reuse policy Crawling not allowed, but possibly free for research after permission from owners.
Anticipated location n/a
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

148. Clue dictionaries

Type Presumably the largest electronic dictionaries for Norwegian, Norwegian-English-Norwegian and Norwegian-German-Norwegian.
Size n/a
Languages Norwegian, English, German.
Rightholders Clue Norge ASA
Anticipated access policy Commercial purchase (ca. 700 euros)
Anticipated reuse policy Commercial purchase (ca. 700 euros)
Anticipated location Clue Norge ASA
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage Commercial
Similar resources or cooperations n/a
Data or tool data

149. dicts.info

Type Four sets of dictionaries in a multitude of languages: Universal dictionary, Wiktionary, Omegawiki, and Wikipedia. All four sets include all the PRESEMT languages: Norwegian, Italian, German, Czech, Greek and English.
Size n/a
Languages Norwegian, Italian, German, Czech, Greek and English.
Rightholders dicts.info
Anticipated access policy Free
Anticipated reuse policy Free
Anticipated location Unknown (possibly dicts.info)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

150. RuN corpus of parallel Norwegian-Russian-English-Serbian texts

Type A parallel Norwegian-Russian-English-Serbian corpus, partly based on the existing Oslo Multilingual Corpus, is being developed in the RuN project with technical assistance from the Text Laboratory, University of Oslo. The corpus will provide a basis for contrastive studies, including the study of grammatical phenomena such as Russian aspect, information structure, (in)definiteness, bare nominals and tense/mood in Russian and Norwegian from the perspective of both native speakers and second language learners.
Size n/a
Languages Norwegian, Russian, English, Serbian
Rightholders ILOS, University of Oslo
Anticipated access policy Restricted
Anticipated reuse policy Restricted
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical The resource is currently being developed, and the project is fully financed. The project receives funding from the Norwegian Centre for International Cooperation in Higher Education (SIU) through its Cooperation Programme with Russia.
Rationale for selection The developers believe that focus on contrastive linguistics and translation studies can bridge the gap between research and education in the field of advanced second language learning of Russian and Norwegian.
Present usage The RuN project has established an education- and research-oriented environment for graduate students and scholars from Russia (notably Murmansk Humanities Institute) and the University of Oslo working on languages in contrast (Russian vs. Norwegian and/or English).
Similar resources or cooperations n/a
Data or tool data

151. Stockholm MULtilingual TReebank (SMULTRON)

Type A parallel treebank first developed by the Computational Linguistics Group at the Department of Linguistics, Stockholm University. Contains aligned syntactic trees for (among others) Norwegian, English and German. Version 1.0 contains around 1000 sentences in English, German and Swedish. The sentences have been PoS-tagged and annotated with phrase structure trees. The trees have been aligned at sentence, phrase and word level. Additionally, the German and Swedish monolingual treebanks contain lemma information. The Institute of Computational Linguistics continues the work on the SMULTRON project. Version 2.0 extends the original treebank with a new text type: 500 sentences from a user manual in English, German, Swedish and Spanish. The SMULTRON treebanks currently distributed (version 2.0, around 1500 sentences) are in TIGER-XML format and comprise 9 treebank files (Spanish not yet included) plus 8 alignment files.
Size 1500 sentences
Languages Several languages including Norwegian.
Rightholders The Computational Linguistics Group at the Department of Linguistics, at Stockholm University.
Anticipated access policy Free of charge for research purposes. Registered users only (name, affiliation, and email address).
Anticipated reuse policy Free of charge for research purposes. Registered users only (name, affiliation, and email address).
Anticipated location The Computational Linguistics Group at the Department of Linguistics, at Stockholm University.
Effort needed (a) technical (b) nontechnical There are plans to extend the treebank with new types and texts and more languages.
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

152. OPUS project corpora

Type Parallel text corpora for many different language pairs. OPUS is a growing collection of translated texts from the web. The OPUS project converts and aligns free online data, adds linguistic annotation, and provides the community with a publicly available parallel corpus. OPUS is based on open source products and the corpus is also delivered as an open content package. Several tools are used to compile the current collection. All pre-processing is done automatically. No manual corrections have been carried out.
Size n/a
Languages Several languages including Norwegian.
Rightholders Department of Linguistics and Philology, Uppsala University
Anticipated access policy Free
Anticipated reuse policy Free
Anticipated location Unknown (possibly Department of Linguistics and Philology, Uppsala University)
Effort needed (a) technical (b) nontechnical n/a
Rationale for selection The main motivation for compiling OPUS is to provide an open source parallel corpus that uses standard encoding formats, including linguistic annotation. A public collection of parallel corpora that can be freely used and distributed makes it possible for everyone to run experiments on bitexts and to compare their results easily.
Present usage n/a
Similar resources or cooperations n/a
Data or tool data

153. Audio files with Norwegian dialects

Type Audio recordings of (older) dialects in Norway and from Norwegians in America: Seip and Selmer 1931; Einar Haugen 1935-1948; Arnstein Hjelde 1987; Joseph Salmons, University of Wisconsin 2009; and Janne Bondi Johannessen and Signe Laake 2010 (video recordings). The files are digitized but not transcribed.
Size n/a
Languages Norwegian dialects
Rightholders n/a
Anticipated access policy Restricted: access only for research and development purposes.
Anticipated reuse policy Restricted.
Anticipated location The Text Laboratory, ILN, University of Oslo
Effort needed (a) technical (b) nontechnical The audio files need to be transcribed and made searchable with Glossa.
Rationale for selection n/a
Present usage n/a
Similar resources or cooperations Dialect archives at other institutions and universities in Norway
Data or tool data

154. Cadastres for Bergen 1686 and 1673

Type Text corpora. Cadastres (grunnbøker) of Bergen city, years 1686 and 1673. Available through web interface, WebGIS and as PDF. Partly indexed on place names, addresses and person names.
Size n/a
Languages Danish and Norwegian
Rightholders Arne Solli and Geir Atle Ersland, AHKR, University of Bergen
Anticipated access policy Open, http://gandalf.aksis.uib.no/bergis/GBB1686.page
Anticipated reuse policy Restricted.
Anticipated location University of Bergen
Effort needed (a) technical (b) nontechnical none
Rationale for selection historical research
Present usage n/a
Similar resources or cooperations n/a
Data or tool data