Shruti Rijhwani


I am a Research Scientist at Google. My interests lie in multilingual NLP and building better language technologies for under-represented languages.

Previously, I earned my Ph.D. from the Language Technologies Institute at Carnegie Mellon University, where I was advised by Graham Neubig. My research focused on developing models for multilingual and low-resource NLP, and my thesis presented improved optical character recognition (OCR) technologies for endangered languages. I was named to the Forbes 30 Under 30 in Science list for my work on NLP for endangered languages.

In the past, I’ve worked as a research intern at Bloomberg AI and as a research fellow at Microsoft Research.

More information about my work experience, publications, and academic service can be found in my CV.

I am best reached by email at shrutirijhwani@gmail.com. Feel free to reach out about my research or anything else I might be able to help with. I’m always happy to answer questions about getting started with NLP research and applying to Ph.D. programs.


Research Highlights

A full list of my publications can be found in the Publications section below.



Publications

Lexically-Aware Semi-Supervised Learning for OCR Post-Correction
S. Rijhwani, D. Rosenblum, A. Anastasopoulos, G. Neubig
TACL, 2021
[PDF] [Code+Data]

MasakhaNER: Named Entity Recognition for African Languages
D. I. Adelani et al., including S. Rijhwani
TACL, 2021
[PDF] [Code+Data]

Evaluating the Morphosyntactic Well-formedness of Generated Texts
A. Pratapa, A. Anastasopoulos, S. Rijhwani, A. Chaudhary et al.
EMNLP, 2021
[PDF] [Code+Data]

Dependency Induction Through the Lens of Visual Perception
R. Su, S. Rijhwani, H. Zhu, J. He, X. Wang, Y. Bisk, G. Neubig
CoNLL, 2021
[PDF] [Code+Data]

OCR Post-Correction for Endangered Language Texts
S. Rijhwani, A. Anastasopoulos, G. Neubig
EMNLP, 2020
[PDF] [Code+Data]

Soft Gazetteers for Low-Resource Named Entity Recognition
S. Rijhwani, S. Zhou, G. Neubig, J. Carbonell
ACL, 2020
[PDF] [Code+Data]

Temporally-Informed Analysis of Named Entity Recognition
S. Rijhwani, D. Preotiuc-Pietro
ACL, 2020
[PDF] [Data]

Improving Candidate Generation for Low-resource Cross-lingual Entity Linking
S. Zhou, S. Rijhwani, J. Wieting, J. Carbonell, G. Neubig
TACL, 2020
[PDF] [Code+Data]

AlloVera: A Multilingual Allophone Database
D. R. Mortensen, X. Li, P. Littell, A. Michaud, S. Rijhwani et al.
LREC, 2020
[PDF] [Data]

Damaged Type and Areopagitica’s Clandestine Printers
C. N. Warren, P. Williams, S. Rijhwani, M. G’Sell
Milton Studies, 2020
[PDF]

A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization
G. Neubig, S. Rijhwani, A. Palmer, J. MacKenzie, H. Cruz, X. Li, M. Lee et al.
First Joint SLTU and CCURL Workshop, 2020
[PDF]

Practical Comparable Data Collection for Low-Resource Languages via Images
A. Madaan, S. Rijhwani, A. Anastasopoulos, Y. Yang, G. Neubig
Practical Machine Learning for Developing Countries Workshop, 2020
[PDF] [Code+Data]

Zero-shot Neural Transfer for Cross-lingual Entity Linking
S. Rijhwani, J. Xie, G. Neubig, J. Carbonell
AAAI, 2019
[PDF] [Code+Data]

Choosing Transfer Languages for Cross-Lingual Learning
Y. Lin, C. Chen, J. Lee, Z. Li, Y. Zhang, M. Xia, S. Rijhwani, J. He et al.
ACL, 2019
[PDF] [Code+Data]

Towards Zero-resource Cross-lingual Entity Linking
S. Zhou, S. Rijhwani, G. Neubig
Workshop on Deep Learning Approaches for Low-Resource NLP, 2019
[PDF] [Code+Data]

Parser Combinators for Tigrinya and Oromo Morphology
P. Littell, T. McCoy, N. Han, S. Rijhwani, Z. Sheikh, D. Mortensen, T. Mitamura, L. Levin
LREC, 2018
[PDF] [Code+Data]

Estimating Code-Switching on Twitter with a Novel Generalized Word-Level Language Detection Technique
S. Rijhwani, R. Sequiera, M. Choudhury, K. Bali, C. S. Maddila
ACL, 2017
[PDF]

Does the Geometry of Word Embeddings Help Document Classification? A Case Study on Persistent Homology-Based Representations
P. Michel*, A. Ravichander*, S. Rijhwani*
Second Workshop on Representation Learning for NLP, 2017
[PDF]

Code-Switching as a Social Act: The Case of Arabic Wikipedia Talk Pages
M. Yoder, S. Rijhwani, C. Rosé, L. Levin
Second Workshop on NLP and Computational Social Science, 2017
[PDF]

Understanding Language Preference for Expression of Opinion and Sentiment: What do Hindi-English Speakers do on Twitter?
K. Rudra, S. Rijhwani, R. Begum, K. Bali, M. Choudhury, N. Ganguly
EMNLP, 2016
[PDF]

Experiments with Cross-lingual Systems for Synthesis of Code-Mixed Text
S. Sitaram, S. K. Rallabandi, S. Rijhwani, A. W. Black
Ninth ISCA Speech Synthesis Workshop (SSW), 2016
[PDF]