Linguistics, its Context and its Application

 

Linguistics as science

·       vs philology or engineering

·       systematic

·       exhaustive

·       theory-based/directed

·       how much science do we need?

 

Questions for linguistics

·       what is universal?

·       what is innate?

·       how can the non-innate parts be learned?

·       what are the units of language?

·       how is language capacity organized into components?

·       how can we represent each component?

·       how does language relate to meaning and thought?

 

Components

·       phonetics: vocal tract and spectra

·       phonology: sound systems for language

·       morphology: roots/stems and affixes

·       syntax: what can be said

·       semantics: looking up and computing meaning

·       discourse: beyond single sentences

·       pragmatics: how we use language
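
As a rough illustration of the morphology component (roots/stems and affixes), the sketch below splits a word into a known stem plus one suffix. The stem list, the suffix table, and the function name `analyze` are assumptions made up for this example; real morphological analyzers handle spelling changes, prefixes, and stacked affixes that this toy version ignores.

```python
# Hypothetical toy lexicon: a few stems and suffixes chosen for illustration only.
STEMS = {"walk", "talk", "jump"}
SUFFIXES = {"s": "3sg or plural", "ed": "past", "ing": "progressive"}

def analyze(word):
    """Return every (stem, suffix, gloss) split of `word` licensed by the toy lexicon."""
    analyses = []
    if word in STEMS:
        analyses.append((word, "", "bare stem"))
    for suffix, gloss in SUFFIXES.items():
        stem = word[:-len(suffix)]
        # Accept the split only when the remainder is a listed stem.
        if word.endswith(suffix) and stem in STEMS:
            analyses.append((stem, suffix, gloss))
    return analyses

print(analyze("walked"))
print(analyze("sing"))   # "s" is not a listed stem, so no analysis is found
```

A word such as "sing" shows why a stem lexicon is needed: it ends in "ing" but is not stem plus affix.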

 

Data for linguistics

·       to the scientist/linguist

·       to the child (little linguist)

·       for machine learning of language

·       words, phrases, sentences or beyond

·       marginal, complex sentences

·       frequencies

·       meaning-utterance pairs: caregiverese

·       psycholinguistic experiments
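
The "frequencies" item can be made concrete with a small sketch that turns a tokenized corpus into relative word frequencies. The five-word corpus and the function name `relative_frequencies` are assumptions for illustration; whitespace splitting stands in for real tokenization.

```python
from collections import Counter

def relative_frequencies(tokens):
    """Map each distinct token to its share of the total token count."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# Toy corpus (an assumption): one short sentence split on whitespace.
tokens = "the dog saw the cat".split()
freqs = relative_frequencies(tokens)
for word, freq in sorted(freqs.items()):
    print(f"{word}\t{freq:.2f}")
```

The same counts could serve the linguist, a model of the child learner, or a machine-learning system; only the corpus changes.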

 

Psycholinguistics

·       performance: is it just noise with respect to competence?

·       comprehension tasks, e.g., attachment ambiguity

·       acquisition: sequences of corpora

·       elicited production

·       relation to general cognition

·       errors in L1 and L2 learning

·       L2 tests and MT evaluation


 

Modifiers of “linguistics”

·       psycho- (see above)

·       socio-: class, subculture

·       politics of language and dialect

·       historical: what makes language evolve

·       comparative

 

Technical modifiers of “linguistics”

·       formal

·       statistical

·       computational

·       and what are NLP and HLT?

 

Applying language as input / output / knowledge

·       input: database front end (restaurants, airlines)

·       output: report generation from data (weather)

·       both: machine translation (web, repair manuals)

·       knowledge: information extraction
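
For the "output" case (report generation from weather data), a minimal template-based sketch might look as follows. The record fields (`city`, `cloud_cover`, `wind_kph`, `high_c`), the thresholds, and the function name `weather_report` are all assumptions for illustration, not a description of any particular system.

```python
def weather_report(record):
    """Render one weather observation as an English sentence via a fixed template."""
    # Thresholds are arbitrary illustrative choices.
    sky = "clear" if record["cloud_cover"] < 0.3 else "cloudy"
    wind = "calm" if record["wind_kph"] < 10 else "windy"
    return (f"{record['city']}: {sky} and {wind}, "
            f"with a high of {record['high_c']} degrees Celsius.")

# Hypothetical observation record.
observation = {"city": "Boston", "cloud_cover": 0.1, "wind_kph": 25, "high_c": 18}
print(weather_report(observation))
```

Template filling works here because the underlying data is narrow and structured; broader domains need real choices about content and wording.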

 

Issues of input / output / knowledge

 

·       i/o reversible, as in Prolog(?)

·       is NLG easier than NLU; if so, why?

·       habitability (NL unlimited as input)

·       limited knowledge representations (KRs) of applications (helps both ways)

 

Text versus speech

·       separate history

·       representation: FST vs (augmented or transformed) CFG

·       independent and (logically) sequential(?)

·       speech-only applications: accessibility aids, voice i/o
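
To give a feel for the finite-state transducer (FST) side of the representation question, here is a small sketch of a two-state transducer that rewrites "n" as "m" before "p". The rule, the state numbering, and the function name `assimilate` are assumptions chosen for illustration; it is hand-coded rather than built with an FST toolkit.

```python
def assimilate(text):
    """Two-state transducer: rewrite 'n' as 'm' when the next symbol is 'p'."""
    output = []
    state = 0  # 0 = nothing pending; 1 = an 'n' was read but not yet emitted
    for ch in text:
        if state == 1:
            if ch == "p":
                output.append("mp")      # pending 'n' surfaces as 'm'
                state = 0
            elif ch == "n":
                output.append("n")       # emit the earlier 'n'; the new one is pending
            else:
                output.append("n" + ch)  # no 'p' followed, so emit 'n' unchanged
                state = 0
        elif ch == "n":
            state = 1                    # delay output until the next symbol is seen
        else:
            output.append(ch)
    if state == 1:
        output.append("n")               # final output: flush a pending 'n'
    return "".join(output)

print(assimilate("inpossible"))
print(assimilate("intact"))
```

The machine needs only a bounded memory (one pending symbol), which is what separates such rules from the nested structure a CFG is used to capture.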

 

Probability and statistics

·       now widely incorporated into symbolic systems

·       hidden Markov models for speech recognition

·       Bayesian spell-checking

·       probabilistic parsers and CFGs
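
Bayesian spell-checking can be sketched as a noisy-channel model: pick the word w that maximizes P(w) · P(typed | w). In the sketch below, the word counts, the error rate `p_error`, and the uniform distribution over single-edit errors are all assumptions for illustration; a real system would estimate both the prior and the channel model from data.

```python
from collections import Counter

# Hypothetical unigram counts standing in for a corpus-derived prior P(w).
WORD_COUNTS = Counter({"the": 500, "then": 60, "they": 80, "than": 40})
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one deletion, transposition, replacement, or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)

def prior(word):
    """P(w): relative frequency of the word in the toy counts."""
    return WORD_COUNTS[word] / sum(WORD_COUNTS.values())

def channel(typed, word, p_error=0.05):
    """P(typed | w): assumed uniform over single-edit errors, zero beyond that."""
    if typed == word:
        return 1 - p_error
    neighbours = edits1(word) - {word}
    return p_error / len(neighbours) if typed in neighbours else 0.0

def correct(typed):
    """Return the known word maximizing P(w) * P(typed | w), or the input if none fits."""
    candidates = [w for w in WORD_COUNTS if channel(typed, w) > 0]
    if not candidates:
        return typed
    return max(candidates, key=lambda w: prior(w) * channel(typed, w))

print(correct("thw"))
print(correct("xyz"))
```

With a uniform channel the prior does most of the work; the same argmax structure underlies HMM decoding for speech recognition, with acoustic likelihoods in place of the edit model.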