Caragea joins NSF as program director

Professor Cornelia Caragea is serving as a program director at the National Science Foundation (NSF) this academic year, in the Directorate for Computer and Information Science and Engineering (CISE). The directorate has an annual budget of over $1 billion, and advances research, innovation and education in computer science, information science, and computer engineering fields. CISE supports hundreds of academic and research organizations across the U.S.

According to UIC’s Office of the Vice Chancellor for Research, $32 million of the university’s $509 million in funding in 2023 came from the NSF. Research funding from all sources allows UIC to conduct transformative research and gives students, from undergraduate to graduate, an opportunity to participate in this work.

Caragea is tasked at NSF with managing a portfolio of grant proposals on natural language processing, information retrieval, and artificial intelligence. She will also conduct review panels and decide what research will be funded.

“I like serving the community, and identifying directions of research,” Caragea said. “I also like learning the mechanics of a big organization like NSF.”

Caragea began her yearlong appointment at NSF in August 2023 and has the option to extend her time with the agency for up to four years. Caragea said she participated in around three dozen review panels as an expert before joining the NSF, and that this work as a panelist gave her a head start as a program director.

She will begin spending more time in Washington, D.C., in the new year, dividing her time evenly between the capital and Chicago. Caragea advises 15 PhD students and wants to ensure they remain on track and continue the research and mission of the group she leads: to pursue cutting-edge research at the intersection of natural language processing, deep learning, and artificial intelligence.

One of the projects that Caragea and her students work on is text and image classification in areas where large annotated datasets are lacking, known as low-resource settings. Deep learning models have shown substantial gains on many natural language processing and image tasks, but employing such models effectively depends on having tens or even hundreds of thousands of labeled samples. The time-consuming and labor-intensive annotation process makes it challenging to obtain large labeled datasets in many real-world scenarios, and relying on unlabeled data can introduce what’s known as noise, which adversely affects the accuracy of predictions.

Caragea and her students focus on designing semi-supervised learning models that are not only robust to the noise that inherently arises from unlabeled data, but are also able to detect potentially mislabeled examples that are harmful for learning. They aim to improve the trustworthiness of models in these settings by calibrating the models to reflect more uncertainty in their predictions and by calling for human review in those cases.
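As a rough illustration of the "call for human review" idea, the sketch below routes each prediction by its confidence. It is not the method used by Caragea's group; it shows a generic confidence-threshold rule, and the function name `route_predictions` and the 0.8 threshold are assumptions made for this example.

```python
from typing import List, Tuple


def route_predictions(
    probs: List[List[float]], threshold: float = 0.8
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Split examples into auto-accepted (index, label) pairs and
    indices that should go to a human reviewer.

    `probs` holds one probability vector per example. The threshold
    is an illustrative value, not one taken from the article.
    """
    accepted: List[Tuple[int, int]] = []
    needs_review: List[int] = []
    for i, p in enumerate(probs):
        # The predicted label is the class with the highest probability.
        label = max(range(len(p)), key=p.__getitem__)
        if p[label] >= threshold:
            accepted.append((i, label))
        else:
            # The model is unsure, so defer to a person.
            needs_review.append(i)
    return accepted, needs_review


# Hypothetical model outputs for three examples and two classes.
probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
accepted, needs_review = route_predictions(probs)
```

A rule like this is only useful when the probabilities are well calibrated, which is why calibration and human review go together in the description above.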

“We as humans say, ‘oh, I’m not sure,’ or ‘maybe this is the answer, double check.’ We want the models to provide the same kind of feedback,” Caragea said. “If everything is predicted with a high probability, there is no way to differentiate between correct and incorrect predictions.”
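One common way to quantify the problem Caragea describes is expected calibration error (ECE), which compares how confident a model is with how often it is actually right. The sketch below is a minimal version using equal-width confidence bins; the function name, the bin count, and the sample numbers are assumptions for illustration and do not come from her group's work.

```python
from typing import List


def expected_calibration_error(
    confidences: List[float], correct: List[bool], n_bins: int = 10
) -> float:
    """Weighted average gap between mean confidence and accuracy,
    computed over equal-width confidence bins."""
    n = len(confidences)
    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(conf for conf, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(avg_conf - accuracy)
    return ece


# A model that is always 99% sure but right only half the time.
overconfident = expected_calibration_error(
    [0.99, 0.99, 0.99, 0.99], [True, False, True, False]
)
# A model that is 75% sure and right three times out of four.
calibrated = expected_calibration_error(
    [0.75, 0.75, 0.75, 0.75], [True, True, True, False]
)
```

In this toy case the overconfident model scores far worse than the calibrated one, even though neither is perfectly accurate, which mirrors the point that uniformly high probabilities make correct and incorrect predictions indistinguishable.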

Despite her increased workload, Caragea enjoys the supportive atmosphere and the collegial environment at the NSF and the work she does there.

“I’m extremely grateful to Dean Peter Nelson and my Department Head Robert Sloan for their help and support,” Caragea said. “Without their support I wouldn’t be here.”