Andrei's research area covers scalable semi-supervised learning with graphs, and combination thereof with other learning techniques, applied to NLP problems, particularly on resource-poor languages. His doctoral advisor has been Katrin Kirchhoff.
K. Kirchhoff and A. Alexandrescu, "Phonetic Classification Using Controlled Random Walks", Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011 [PDF] [BibTeX].
A. Alexandrescu and K. Kirchhoff, "Graph-Based Learning for Statistical Machine Translation", Proceedings of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT) 2009 conference [PDF] [BibTeX].
A. Alexandrescu and K. Kirchhoff, "Graph-Based Learning for Phonetic Classification", Proceedings of The 2007 IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop [PDF] [BibTeX]
Andrei's dissertation (a.k.a. thesis) is a tad unusual because it includes as-of-yet unpublished research (usually the doctoral dissertation is an expansion of the author's published research). Also, the text aims at being more readable than others, but it is unclear whether it succeeded at that. Doctoral dissertations are highly specialized and need to build extensive background, which inevitably ends up in referring to previous work that the putative reader would need to absorb in addition to the dissertation proper.
Below is a quick list of the contributions of the dissertation and other items of possible interest:
- An alternative proof of convergence for iterative label propagation. Compared to the original proof in Jerry Zhu's thesis [PDF] (pages 6-7), the proof in the dissertation rigorously uses only the minimal requirements for convergence, while remaining simple and terse.
- A data-driven approach to graph construction. That approach uses a supervised classifier that provides features for the graph-based learner.
- A framework for applying graph-based learning to structured inputs and outputs, in a formalization that is applicable to a large variety of tasks. Experiments are done on a real-world translation task.
- An in-place label propagation algorithm that is always faster than the original iterative algorithm (experimentally converges in roughly one third of the number of steps).
- A multicore label propagation algorithm that uses parallel processing and benign data races to distribute work on label propagation.
- A graph reduction algorithm that reduces the size of the graph by orders of magnitude without affecting the result of label propagation.
- A description and analysis of the inverted index data structure, along with an efficient search algorithm.
- A thorough overview of kd-trees with proofs and algorithms. The proofs include a fix of a 40-years old bug: in their seminal analysis of kd-trees (An algorithm for finding best matches in logarithmic expected time), Friedman, Bentley, and Finkel describe the requirements that a distance function must fulfill in order to be usable with kd-trees. These properties include symmetry and monotonicity, but oddly enough the authors forgot the essential requirement that the coordinate distance applied to identical values yields zero: fi(x, x) = 0 for all pertinent x in all dimensions i. Follow-up work on kd-trees also don't mention this constraint (of which necessity is obvious in hindsight). It was about time to fix that bug.
- An original algorithm called DynTrie that optimizes string kernel computations over a set of strings, which experimentally is three times faster than existing approaches.
- Descriptions of experiments conducted using label propagation on tasks in Natural Language Processing, Automatic Speech Recognition, and Machine Translation. The results show that the proposed approach improves significantly over state-of-the-art systems.