- Apr 30: Long and short paper submission deadline
- May 14: Acceptance notification
- May 21: Camera ready
- Jul 19: Workshop
Shared Task Dates
- Feb 8: Training data release
- Mar 23: Test phase starts
- Apr 19: Test phase ends (deadline extended from Apr 6)
- May 4: System description paper
- May 14: Author feedback
- May 21: Camera ready
Code-switching (CS) is the phenomenon by which multilingual speakers switch back and forth between their common languages in written or spoken communication. CS typically occurs at the intersentential, intrasentential (mixing of words from multiple languages in the same utterance), and even morphological (mixing of morphemes) levels. CS presents serious challenges for language technologies such as parsing, machine translation (MT), automatic speech recognition (ASR), information retrieval (IR), information extraction (IE), and semantic processing. Traditional techniques trained for one language quickly break down when input from another language is mixed in. Even for problems that are considered solved, such as language identification or part-of-speech tagging, performance degrades at a rate proportional to the amount and level of mixed-language content present.
This workshop aims to bring together researchers interested in solving the problem and to increase community awareness of viable solutions for reducing the complexity of the phenomenon. The workshop invites contributions from researchers working on NLP approaches for the analysis and processing of mixed-language data, especially with a focus on intrasentential code-switching. Topics of relevance to the workshop include the following:
- Development of linguistic resources to support research on code-switched data
- NLP approaches for language identification in code-switched data
- NLP approaches for named entity recognition in code-switched data
- NLP techniques for the syntactic analysis of code-switched data
- Domain/dialect/genre adaptation techniques applied to code-switched data processing
- Language modeling approaches to code-switched data processing
- Crowdsourcing approaches for the annotation of code-switched data
- Machine translation approaches for code-switched data
- Position papers discussing the challenges of code-switched data to NLP techniques
- Methods for improving ASR in code-switched data
- Survey papers of NLP research for code-switched data
- Sociolinguistic aspects of code-switching
- Sociopragmatic aspects of code-switching
Authors are invited to submit papers describing original, unpublished work in the topic areas listed above. Full papers should not exceed eight pages. Additionally, authors are invited to submit short papers not exceeding four pages. Short papers usually describe:
- a small, focused contribution;
- work in progress;
- a negative result;
- an opinion piece; or
- an interesting application nugget.
All papers can have up to 2 pages of references. All submissions must be in PDF format and must conform to the official ACL 2018 style guidelines:
- ACL Author Guidelines (see Paper Submission and Templates for templates).
The reviewing process will be blind, and papers should not include the authors' names and affiliations. Each submission will be reviewed by at least three members of the program committee. Accepted papers will be published in the workshop proceedings and made available in the ACL Anthology.
Multiple Submission Policy. Papers that have been or will be submitted to other meetings or publications are acceptable, but authors must indicate this information at submission time. If accepted, authors must notify the organizers as to whether the paper will be presented at the workshop or elsewhere.
Papers should be submitted electronically at https://www.softconf.com/acl2018/CALCS.