Academic Positions

  • 2012-Present

    Associate Professor

    University of A Coruña, Computer Science Faculty

  • 2022-Present

    Computer Science Eng. Degree Coordinator

    University of A Coruña, Computer Science Faculty

  • 2020

    Visiting Faculty Researcher

    Google Research, London, UK

  • 2012-2020

    Assistant Professor

    University of A Coruña, Computer Science Faculty

  • 2014-2018

    President of the SERI

    Spanish Information Retrieval Society

  • 2009-2011

    María Barbeito, Predoctoral Grant

    Xunta de Galicia, University of A Coruña, Computer Science Faculty

  • 2006-2009

    Research Assistant & PhD student

    University of A Coruña, Computer Science Faculty

  • Summer 2005

    Software Engineer Intern

    Igalia Software Engineering

Education & Training

  • Ph.D. 2013

    Ph.D. in Computer Science

    University of A Coruña, Computer Science Department

  • DEA 2008

    Advanced Studies Diploma in CS&AI

    University of A Coruña, Computer Science Department

  • M.Sc. Eng. + B.Sc. Eng. 2006

    Ingeniero en Informática (Computer Science Engineering degree)

    University of A Coruña/University of West of England, Faculty of Computer Science/Faculty of Environment and Technology

Honors, Awards and Grants

  • ACM RecSys 2012
    Best Short Paper Award
    ACM RecSys
    with Alejandro Bellogín: Spectral clustering techniques have become one of the most popular clustering algorithms, mainly because of their simplicity and effectiveness. In this work, we use one of these techniques, Normalised Cut, to derive a cluster-based collaborative filtering algorithm that outperforms other state-of-the-art techniques in terms of ranking precision. We frame this technique as a method for neighbour selection, and we show its effectiveness when compared with other cluster-based methods. Furthermore, the performance of our method can be further improved if standard similarity metrics, such as Pearson correlation, are also used when predicting the rating score. A minimal sketch of the neighbour-selection idea appears after this list.
  • 2009-2012
    María Barbeito Predoctoral Grant, Xunta de Galicia
    Xunta de Galicia
    The María Barbeito Predoctoral Grant Program is a competitive call run annually by the Government of Galicia, Spain (Xunta de Galicia). The objective of the program is to support the first steps in the scientific careers of young Galician researchers, with the final aim of integrating them into the Galician R&D system after the defence of their Ph.D. theses.
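
The neighbour-selection idea behind the awarded short paper can be illustrated with a minimal sketch (a toy illustration under my own simplifying assumptions, not the paper's code): cluster users with a spectral method over a user-user affinity graph, then predict a rating from same-cluster neighbours weighted by Pearson correlation.

    import numpy as np
    from sklearn.cluster import SpectralClustering

    # Toy user x item rating matrix (0 = unrated); two obvious taste groups.
    R = np.array([[5, 4, 0, 1],
                  [4, 5, 1, 0],
                  [5, 5, 0, 1],
                  [1, 0, 5, 4],
                  [0, 1, 4, 5],
                  [1, 1, 5, 5]], dtype=float)

    def predict(R, user, item, n_clusters=2):
        # Non-negative cosine affinities between users (a simplification:
        # unrated cells are treated as zeros).
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        W = np.clip((R @ R.T) / (norms @ norms.T), 0.0, None)
        labels = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                                    random_state=0).fit_predict(W)
        # Neighbours: users in the same cluster who rated the target item.
        neigh = [v for v in range(R.shape[0])
                 if v != user and labels[v] == labels[user] and R[v, item] > 0]
        if not neigh:
            return R[user][R[user] > 0].mean()
        # Weight each neighbour's rating by Pearson correlation with the user.
        sims = np.array([np.corrcoef(R[user], R[v])[0, 1] for v in neigh])
        ratings = np.array([R[v, item] for v in neigh])
        return float(sims @ ratings / (np.abs(sims).sum() + 1e-9))

    print(round(predict(R, user=0, item=2), 2))  # low rating expected for user 0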

Co-authors and collaborators

People with whom I have worked: most of my work would not be possible without the help of many group members and external collaborators.

  • Álvaro Barreiro, University of A Coruña
  • David E. Losada, University of Santiago de Compostela
  • Alejandro Bellogín, Universidad Autónoma de Madrid
  • Fabio Crestani, University of Lugano
  • Daniel Valcarce, Google
  • Patricia Martín-Rodilla, CSIC
  • Anxo Pérez, University of A Coruña
  • Pablo Castells, Universidad Autónoma de Madrid
  • Alfonso Landin, University of A Coruña
  • David Otero, University of A Coruña
  • Filip Radlinski, Google Research
  • Paloma Piot, University of A Coruña
  • Isabel Moskowich, University of A Coruña
  • Ismael Hasan, University of A Coruña
  • Jorge Gabín, University of A Coruña / Linknovate
  • Silvia López Larrosa, University of A Coruña
  • Mark J. Carman, Monash University
  • Shima Gerani, University of Lugano
  • Mostafa Keikha, University of Massachusetts Amherst
  • José Manuel González Chenlo, University of Santiago de Compostela

Research Projects

  • Content curation for consumer health search

    PID2022-137061OB-C21

    C3HS Content Curation for Consumer Health Search

    One of the most pressing challenges in Information Access today is combating the spread of misinformation. Existing methods for misinformation detection employ techniques such as neural network models, statistical methods, linguistic analysis, and fact-checking strategies. However, the threat of false information has intensified with the emergence of highly creative language models. Misinformation on the web and social media poses significant social, economic, and political repercussions, leading to serious consequences such as election interference, polarization, and violence. This issue becomes particularly critical during global health crises, as misinformation surrounding the COVID-19 pandemic can result in catastrophic health outcomes. We have witnessed numerous myths proliferate on social media regarding COVID-19 treatments, the virality of the virus, and misleading narratives targeting marginalized communities. This challenge is especially pronounced in developing countries, where low literacy rates and limited exposure to technology hinder effective fake news detection. Nonetheless, increased access to affordable internet makes these populations more susceptible to believing and acting on misinformation.

    Web search is a prevalent means of seeking online information, particularly regarding health-related advice. This area of web search is commonly referred to as Consumer Health Search. Accessing reliable health-related information necessitates retrieval algorithms capable of promoting trustworthy documents while filtering out unreliable ones. To achieve this, we aim to integrate various components, including query-document matching features, passage relevance estimation, reliability assessments, and suitable recommendation models. Our project seeks to establish a comprehensive pipeline for misinformation detection by fusing multiple features and complementary tools. We aspire to intelligently combine advanced techniques from diverse fields such as Information Retrieval, Text Classification, Recommendation, and Natural Language Processing to design effective content curation strategies for consumer health search tasks. This project is distinctly multidisciplinary, encompassing aspects of Text Processing (Information Retrieval, Automatic Text Classification, Personalization, and Recommender Systems), Computational Linguistics (Discourse Analysis, Advanced Natural Language Processing), and High-Performance Computing for Big Data. Furthermore, our team includes experts in Psychology, who will tackle challenges related to incorporating expert knowledge into the models, validating the resulting technologies, and applying the project outcomes in real-world contexts.
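
    As a rough illustration of the fusion idea, a content-curation ranker could mix topical relevance with an estimated reliability score and drop documents below a reliability floor. This is a toy sketch with invented field names and weights, not the project's pipeline:

        def curate(results, w_rel=0.7, w_trust=0.3, floor=0.3):
            """Toy content-curation ranker: discard documents whose estimated
            reliability falls below a floor, then order the rest by a weighted
            mix of topical relevance and reliability."""
            kept = [r for r in results if r["reliability"] >= floor]
            return sorted(kept,
                          key=lambda r: w_rel * r["relevance"] + w_trust * r["reliability"],
                          reverse=True)

        docs = [{"id": "d1", "relevance": 0.9, "reliability": 0.2},  # on-topic but dubious
                {"id": "d2", "relevance": 0.7, "reliability": 0.9},
                {"id": "d3", "relevance": 0.5, "reliability": 0.8}]
        print([d["id"] for d in curate(docs)])  # ['d2', 'd3']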

  • HYBRIDS

    HORIZON-MSCA-2021-DN-01 GA 101073351

    HYBRIDS: Hybrid Intelligence for Monitoring and Promoting Good Democracy Practices

    False rumors, fake news, and hate speech against vulnerable minorities on social media are increasingly recognized as significant threats to democracies. A comprehensive global strategy to combat disinformation is essential, as open democratic societies depend on free citizens who can access verifiable information to form their own opinions on various political issues.

    The primary scientific objective of the HYBRIDS project is to equip researchers with the knowledge necessary to design strategies and tools to address disinformation based on an in-depth analysis of public discourse.

    There have been notable advancements in the automatic detection of disinformation using natural language processing and emerging artificial intelligence techniques in the fields of machine and deep learning. However, this remains a complex task that demands a high level of natural language understanding, inference, and reasoning. To enhance strategies for countering disinformation, HYBRIDS will integrate structured knowledge from social and human sciences into natural language processing tools and deep learning algorithms to develop new hybrid intelligence systems. The concept of Hybrid Intelligence entails the combination of machine and human intelligence to overcome the limitations of current artificial intelligence methods.

    While hybrid systems are expected to become increasingly critical in the near future, there are very few experts capable of designing and developing such systems. This scarcity primarily arises from the multidisciplinary nature of the hybrid strategy and the challenge of finding researchers who are fully trained in traditionally distinct disciplines, such as computer engineering, social sciences, or linguistics. We believe the time is ripe to establish a Doctoral Network equipped to train researchers in hybrid methodologies for their application in social studies, with a focus on sustaining good democratic practices across Europe.

  • Big-eRisk

    PLEC2021-007662

    Big-eRisk: Early Prediction of Personal Risks in Large Datasets

    Mental health is a critical component of the World Health Organization's definition of health. It directly influences how we think, feel, and behave. Mental disorders are complex and can manifest in various ways. In 2017, approximately 792 million people lived with some form of mental health issue, affecting more than one in ten individuals worldwide. Experts have recently warned that the aftermath of the COVID-19 pandemic could result in a global mental health crisis.

    Despite the severity of these disorders, many individuals do not receive timely treatment. Early diagnosis is crucial for effective intervention, as it can significantly reduce the adverse effects of disorders and lower costs for public health and social services. However, tools for detecting mental health issues are limited due to the stigmatization surrounding mental illness.

    Social media has emerged as a prominent communication platform, where many people share their emotions, thoughts, and feelings. The vast quantity of daily posts can enhance our understanding of individuals' mental states. Research indicates that analyzing language use in online data can help detect mental disorders. Social media provides a unique opportunity for individuals to express themselves anonymously, making it easier for them to share their true feelings and seek support from others.

    Since 2017, we have been advancing this line of research through eRisk (https://erisk.irlab.org/), which explores evaluation methodologies, effectiveness metrics, and practical applications for early risk detection on the Internet. Over the last five years of this international competition, we have released numerous datasets related to risks such as depression, eating disorders, self-harm, and pathological gambling. Various international teams have contributed their models to promote this new area of research. Our ambition is to produce resources in evaluation methodologies, datasets, and models that can scale to the magnitude of social data. We envision that the results of this project will help develop the first generation of tools to assist social and health systems in early identification of individuals at risk.

    To address the challenges beyond a laboratory setting, we have formed a solid interdisciplinary team composed of the lead organizers of the eRisk international competition, mental health professionals, and computer scientists with expertise in machine learning, information retrieval, natural language processing, and high-performance computing. The team, led by the University of A Coruña, includes the University of Santiago (a co-organizer of the eRisk competition) and Linknovate, a research-intensive start-up with strong ties to university teams.
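
    For context, eRisk's early-detection tasks are scored with metrics that penalise late alarms. Below is a sketch of the ERDE measure defined in the eRisk overview papers; the parameter values are illustrative (c_fp, for instance, is commonly tied to the prevalence of positive cases):

        import math

        def erde(decision, truth, k, o=50, c_fp=0.1, c_fn=1.0, c_tp=1.0):
            """Early Risk Detection Error for a single user: a correct alarm
            raised after seeing k writings still pays a latency cost that grows
            sigmoidally once k exceeds the deadline parameter o."""
            if decision and not truth:
                return c_fp
            if not decision and truth:
                return c_fn
            if decision and truth:
                return c_tp * (1.0 - 1.0 / (1.0 + math.exp(k - o)))
            return 0.0

        # A true positive decided early is cheap; the same decision made late is not.
        print(round(erde(True, True, k=10), 4), round(erde(True, True, k=90), 4))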

  • Technologies for Early Detection of Psychological Disorders

    RTI2018-093336-B-C22

    Technologies for Early Detection of Signs of Psychological Disorders

    This project focuses on several areas of Information Technology, including search, recommendation, massive data processing, and computational linguistics, alongside Psychology. The two subprojects each bring unique expertise in their respective domains.

    Subproject 2 (UDC) will leverage experience in search, recommendation, and psychology to tackle a series of challenges and activities, including:

    • Developing new methods and resources for evaluating information access systems, specifically for early prediction of signs of psychological disorders.
    • Defining effective search and filtering methods to identify texts relevant to various profiles of psychological disorders, as well as creating models for topic analysis and its temporal evolution.
    • Establishing methods for analyzing results, generating conclusions, and utilizing expert psychologists' knowledge to guide search, filtering, and recommendation components. These expert-driven activities will play a crucial role in validating the predictive technologies developed in this project.
    • Developing content recommendation methods based on collaborative filtering, both content-based and model-based (linear models, latent models, embeddings), tailored to the domain of psychological disorders; a minimal sketch of the model-based flavour follows this list.
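
    A minimal sketch of the model-based (latent factor) flavour mentioned in the last point, under toy assumptions rather than the project's actual models:

        import numpy as np

        def factorize(R, k=8, steps=300, lr=0.01, reg=0.05, seed=0):
            """Latent-factor model fit by SGD on the observed cells of a
            user x item rating matrix R (0 = unrated)."""
            rng = np.random.default_rng(seed)
            P = rng.normal(scale=0.1, size=(R.shape[0], k))  # user factors
            Q = rng.normal(scale=0.1, size=(R.shape[1], k))  # item factors
            for _ in range(steps):
                for u, i in np.argwhere(R > 0):
                    err = R[u, i] - P[u] @ Q[i]
                    pu = P[u].copy()
                    P[u] += lr * (err * Q[i] - reg * P[u])
                    Q[i] += lr * (err * pu - reg * Q[i])
            return P @ Q.T  # dense matrix of predicted ratings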
  • Mineco

    TIN2015-64282-R

    Probabilistic Personalized Information Access Systems

    Recommender Systems (RecSys) aim, given a set of users, a set of items and a set of users' ratings of items, to generate personalised item recommendations for users. Traditionally, RecSys can exploit information both from users' past interactions with items and from the content of the items to generate new suggestions. These systems have proven key to facilitating access to information, products and services. Specifically, it is estimated that a significant percentage of e-commerce transactions are motivated by recommendations: for example, Amazon sales increased by 29% after integrating a recommendation engine.

    The Spanish Strategy for Science, Technology and Innovation 2013-2020 establishes the need to provide Spanish companies with innovative models to increase the efficiency and competitiveness of their processes for commercialising new products and services. Given the migration of our economy towards the digital society, these models must provide innovative technological solutions that transform the way we do business, our sales channels and our mechanisms of relationship with the consumer. Given these objectives and challenges, recommender systems for products and services play a central role. Thus, in this project, we want to advance the state of the art by proposing new recommendation models that, with a solid formal probabilistic basis, may help increase sales, improve products and raise buyer satisfaction. These models, and their translation into domains and instances of actual use in the business community, contribute, through the quality of their recommendations, to the development of the digital economy.

    A booming research area is the translation of classic Information Retrieval approaches to the recommendation task. In particular, in this research project, we propose the use of probabilistic language models for the item recommendation task. Recently, we developed the first such formalisations, obtaining high effectiveness figures. Given this positive experience, we want to extend these models beyond the collaborative filtering approach, considering new estimates and models that include and integrate different content information and capture contextual and temporal aspects. Furthermore, we propose the integration of Bayesian optimisation techniques to develop models that not only generate tailored product suggestions but also generate them in a personalised manner, adapting the recommendation models to the particularities of the users. All these objectives are constrained by a common transversal objective: the efficiency, scalability and robustness of such methods in relation to their translation into real applications in the productive sector.
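
    To make the language-modelling angle concrete, here is a hedged toy sketch of Jelinek-Mercer smoothed item scoring for collaborative filtering (my own simplified formulation, not the project's models):

        from collections import Counter

        def lm_recommend(user_items, neighbour_profiles, all_profiles, lam=0.5):
            """Score unseen items with a smoothed probability:
            p(i|u) = (1 - lam) * p_ml(i | u's neighbourhood) + lam * p(i | collection)."""
            neigh = Counter(i for p in neighbour_profiles for i in p)
            coll = Counter(i for p in all_profiles for i in p)
            n_total, c_total = sum(neigh.values()), sum(coll.values())
            scores = {i: (1 - lam) * (neigh[i] / n_total if n_total else 0.0)
                         + lam * c / c_total
                      for i, c in coll.items() if i not in user_items}
            return sorted(scores, key=scores.get, reverse=True)

        # Items liked by the neighbourhood rank first among those the user has not seen.
        print(lm_recommend({"a"}, [["a", "b"], ["b", "c"]], [["a", "b"], ["b", "c"], ["d"]]))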

  • Mineco

    TIN2012-33867

    Information Retrieval & Sentiment Analysis in Social Web

    In many application domains, there is a growing need to exploit opinions expressed by people in the Web. Decision making processes in companies and organizations can be potentially enriched with software tools able to monitor the voice of the people about products/services, and able to estimate the customer satisfaction. Similarly, governments could promptly obtain the response of the citizens to political actions and, in broad terms, opinion-rich information can help in political decision making processes. On the other hand, users can take into account the opinions of others about products, services or any other issue that affects their information needs in public, private and professional domains.

    In recent years, several research advances have been made in Web Information Retrieval (IR) and in the field of Opinion Mining and Sentiment Analysis. Analyzing and exploiting opinions from the web presents new challenges and requires techniques radically different from those of relevance-based retrieval, which is typical of web search. However, it is known that, for a sentiment mining and analysis system to be useful, effective topic retrieval must be available. This project proposes a number of complementary research lines to improve web retrieval and sentiment analysis. We will conduct research into models and techniques that have recently yielded promising results: improving pseudo-relevance feedback using three specific techniques (cluster-based pseudo-relevance feedback, adaptive pseudo feedback and selective pseudo feedback); improving traditional techniques to detect opinions and to estimate polarity with new models of sentiment flow; and improving feature mining methods to associate opinions with aspects or properties of the reviewed objects. The team that proposes this project has experience with these research topics and has already made several contributions in these areas.
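
    For illustration, a minimal RM3-style sketch of the pseudo-relevance feedback idea (my own simplification; the project studies more specific variants such as cluster-based and selective feedback):

        from collections import Counter

        def expand_query(query_terms, top_docs, doc_scores, n_terms=10, alpha=0.6):
            """Weight terms from the top-ranked documents by the documents'
            retrieval scores, then interpolate with the original query."""
            weights = Counter()
            for doc, score in zip(top_docs, doc_scores):  # doc = list of tokens
                length = len(doc)
                for term, tf in Counter(doc).items():
                    weights[term] += score * tf / length
            norm = sum(weights.values()) or 1.0
            expanded = Counter({t: alpha / len(query_terms) for t in query_terms})
            for term, w in weights.most_common(n_terms):
                expanded[term] += (1 - alpha) * w / norm
            return expanded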

  • Micinn

    TIN2008-06566-C04-04

    Information Retrieval on different media based on multidimensional models: Relevance, novelty, personalization and context.

    There is a growing realisation that relevant information will be accessible increasingly across media, across contexts and across modalities. The retrieval of such information will depend on factors such as time, place, and history of interaction, task in hand, current user interests, etc. To achieve this, Information Retrieval (IR) models that go beyond the usual relevance-oriented approach will need to be developed so that they can be deployed effectively to enhance retrieval performance. This is important to meet the information access demands of today's users. As a matter of fact, the growing need to deliver information on request in a form that can be readily and easily digested continues to be a challenge.

    In this coordinated project with the University of Granada and the Universidad Autónoma de Madrid, we tackled the IR problem from a multidimensional perspective. Besides the dimension of relevance, we studied how to endow systems with advanced capabilities for novelty detection, redundancy filtering, subtopic detection, personalization and context-based retrieval. These dimensions were considered not only for the basic retrieval task but also for other tasks such as automatic summarization, document clustering and categorization. This research therefore tried to open new ways to improve the quality of access to sources of information.

  • Xunta de Galicia

    07SIN005206PR

    Improving news retrieval and access to financial information: web news retrieval

    The objective of this project was to improve NowOnWeb, an R&D platform for web-news retrieval developed by the IRLab. The tasks of the project centred, on the one hand, on efficiency improvements (faster indexing, more scalable query processing and a better crawling engine) and, on the other hand, on effectiveness improvements in terms of news relevance, construction of better summaries and exploitation of the query logs.

  • MEC

    TIN2005-08521-C02-02

    Retrieval of relevant and novel sentences using IR models and techniques

    The aim of this project is to improve the performance of systems for sentence retrieval and novelty detection. This task, located in the field of Information Retrieval (IR), is a step forward from the basic problem of document retrieval. Given a user query which retrieves an ordered set of documents, this set is processed to identify those sentences which are relevant to the query. This selection of sentences has to be done while avoiding redundant material. The task defined in this way has recently been introduced in the field of IR (the “novelty task”) and is highly related to other IR problems. Hence, lessons learned in novelty are potentially beneficial in other IR subareas. Moreover, the state of the art in novelty clearly shows the need for more research on sentence retrieval and novelty detection. In this respect, several formalisms and tools which have been successfully applied to other IR problems are especially promising for novelty. This project will address the application of language models, fuzzy quantification and dimensionality reduction to sentence retrieval and novelty detection. We strongly believe that the variety of approaches taken is a good starting point for improving the effectiveness of the novelty task. Furthermore, this facilitates cross-fertilization between these research lines, which is an added value for this project. Throughout this document we will provide evidence on the adequacy of these models and techniques for novelty. The implementation of the proposals derived from this project will be based on research tools and platforms available for experimentation, along with the development of our own code. The evaluation will be conducted with standard benchmarks and using the methodology of the field of IR.
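
    A minimal sketch of the redundancy-filtering core of the novelty task (cosine similarity over bag-of-words vectors; a toy illustration rather than the formalisms proposed above):

        import math
        from collections import Counter

        def cosine(a, b):
            ca, cb = Counter(a), Counter(b)
            dot = sum(ca[t] * cb[t] for t in ca)
            na = math.sqrt(sum(v * v for v in ca.values()))
            nb = math.sqrt(sum(v * v for v in cb.values()))
            return dot / (na * nb) if na and nb else 0.0

        def novel_sentences(ranked, threshold=0.6):
            """Scan relevance-ranked sentences (token lists, best first) and
            keep one only if it is not too similar to anything already kept."""
            selected = []
            for sent in ranked:
                if all(cosine(sent, s) < threshold for s in selected):
                    selected.append(sent)
            return selected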

Publications

Decoding Hate: Exploring Language Models' Reactions to Hate Speech

Paloma Piot, Javier Parapar
Preprint arXiv:2410.00775, October 2024

Abstract

Hate speech is a harmful form of online expression, often manifesting as derogatory posts. It is a significant risk in digital environments. With the rise of Large Language Models (LLMs), there is concern about their potential to replicate hate speech patterns, given their training on vast amounts of unmoderated internet data. Understanding how LLMs respond to hate speech is crucial for their responsible deployment. However, studies of the behaviour of LLMs towards hate speech have been limited. This paper investigates the reactions of seven state-of-the-art LLMs (LLaMA 2, Vicuna, LLaMA 3, Mistral, GPT-3.5, GPT-4, and Gemini Pro) to hate speech. Through qualitative analysis, we aim to reveal the spectrum of responses these models produce, highlighting their capacity to handle hate speech inputs. We also discuss strategies to mitigate hate speech generation by LLMs, particularly through fine-tuning and guideline guardrailing. Finally, we explore the models' responses to hate speech framed in politically correct language.

Enhancing Automatic Keyphrase Labelling with Text-to-Text Transfer Transformer (T5) Architecture: A Framework for Keyphrase Generation and Filtering

Jorge Gabín, M Eduardo Ares, Javier Parapar
Preprint arXiv:2409.16760, September 2024

Abstract

Automatic keyphrase labelling stands for the ability of models to retrieve words or short phrases that adequately describe documents' content. Previous work has put much effort into exploring extractive techniques to address this task; however, these methods cannot produce keyphrases not found in the text. Given this limitation, keyphrase generation approaches have arisen lately. This paper presents a keyphrase generation model based on the Text-to-Text Transfer Transformer (T5) architecture. Having a document's title and abstract as input, we learn a T5 model to generate keyphrases which adequately define its content. We name this model docT5keywords. We not only perform the classic inference approach, where the output sequence is directly selected as the predicted values, but we also report results from a majority voting approach. In this approach, multiple sequences are generated, and the keyphrases are ranked based on their frequency of occurrence across these sequences. Along with this model, we present a novel keyphrase filtering technique based on the T5 architecture. We train a T5 model to learn whether a given keyphrase is relevant to a document. We devise two evaluation methodologies to prove our model's capability to filter inadequate keyphrases. First, we perform a binary evaluation where our model has to predict if a keyphrase is relevant for a given document. Second, we filter the predicted keyphrases by several AKG models and check if the evaluation scores are improved. Experimental results demonstrate that our keyphrase generation model significantly outperforms all the baselines, with gains exceeding 100...
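
The majority-voting step described above can be sketched in a few lines; the "; " keyphrase separator is an assumption made for the example, as the model's actual output format may differ:

    from collections import Counter

    def majority_vote(generated_sequences):
        """Rank keyphrases by how many sampled output sequences contain them."""
        counts = Counter()
        for seq in generated_sequences:  # e.g. "kp1; kp2; kp3"
            for kp in {k.strip().lower() for k in seq.split(";") if k.strip()}:
                counts[kp] += 1
        return [kp for kp, _ in counts.most_common()]

    samples = ["deep learning; keyphrase generation; t5",
               "keyphrase generation; transformers",
               "keyphrase generation; t5; evaluation"]
    print(majority_vote(samples)[:2])  # ['keyphrase generation', 't5']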

Comparison of Clustering Algorithms for Knowledge Discovery in Social Media Publications: A Case Study of Mental Health Analysis

Manuel Couto, Javier Parapar, David E Losada
Journal Procesamiento del Lenguaje Natural, vol. 73, pp 69-81, 2024

Abstract

In the age of social media, user-generated content is critical for detecting early signs of mental disorders. In this study, we use thematic clustering to analyze the content of the social media platform Reddit. Our primary goal is to use clustering techniques for comprehensive topic discovery, with a focus on identifying common themes among user groups suffering from mental illnesses such as depression, anorexia, gambling addiction, and self-harm. Our findings show that certain clusters are more cohesive, e.g., with a higher proportion of texts indicating depression. Furthermore, we discovered subreddits that are strongly linked to texts from the depressed user group. These findings shed light on how online interactions and subreddit themes may impact users' mental health, paving the way for future research and more targeted interventions in the field of online mental health.

MetaHate: A dataset for unifying efforts on hate speech detection

Paloma Piot, Patricia Martín-Rodilla, Javier Parapar
Conference Proceedings of the International AAAI Conference on Web and Social Media, vol. 18, pp 2025-2039, 2024

Abstract

Hate speech represents a pervasive and detrimental form of online discourse, often manifested through an array of slurs, from hateful tweets to defamatory posts. As online platforms connect people globally, such speech proliferates and poses significant social, psychological, and occasionally physical threats to targeted individuals and communities. Current computational linguistic approaches for tackling this phenomenon rely on labelled social media datasets for training. To unify efforts, our study addresses the critical need for a comprehensive meta-collection, advocating for an extensive dataset to help counteract this problem effectively. We scrutinized over 60 datasets, selectively integrating those pertinent into MetaHate. This paper offers a detailed examination of existing collections, highlighting their strengths and limitations. Our findings contribute to a deeper understanding of the existing datasets, paving the way for training more robust and adaptable models. These enhanced models are essential for effectively combating the dynamic and complex nature of hate speech in the digital realm.

Explainable Depression Symptom Detection in Social Media

Eliseo Bao, Anxo Pérez, Javier Parapar
Journal Health Information Science and Systems, vol. 12, article 47, 2024

Abstract

Users of social platforms often perceive these sites as supportive spaces to post about their mental health issues. Those conversations contain important traces about individuals’ health risks. Recently, researchers have exploited this online information to construct mental health detection models, which aim to identify users at risk on platforms like Twitter, Reddit or Facebook. Most of these models are focused on achieving good classification results, ignoring the explainability and interpretability of the decisions. Recent research has pointed out the importance of using clinical markers, such as the use of symptoms, to improve trust in the computational models by health professionals. In this paper, we introduce transformer-based architectures designed to detect and explain the appearance of depressive symptom markers in user-generated content from social media. We present two approaches: (i) train a model to...

eRisk 2024: Depression, Anorexia, and Eating Disorder Challenges

Javier Parapar, Patricia Martín-Rodilla, David E Losada, Fabio Crestani
Conference Proceedings of the 46th European Conference on Information Retrieval, ECIR 2024, Glasgow, UK, March 24–28, 2024, pp 474-481

Abstract

In 2017, we launched eRisk as a CLEF Lab to encourage research on early risk detection on the Internet. Since then, thanks to the participants' work, we have developed detection models and datasets for depression, anorexia, pathological gambling and self-harm. 2024 will be the eighth edition of the lab, where we will present a revision of the sentence ranking task for depression symptoms and the third editions of the tasks on early alerting of anorexia and on eating disorder severity estimation. This paper outlines the work that we have done to date, discusses key lessons learned in previous editions, and presents our plans for eRisk 2024.

Delving into the Depths: Evaluating Depression Severity through BDI-biased Summaries

Mario Ezra Aragón, Javier Parapar, David E Losada
Conference 9th ACL Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024), 2024, pp 12-22

Abstract

Depression is a global concern suffered by millions of people, significantly impacting their thoughts and behavior. Over the years, heightened awareness, spurred by health campaigns and other initiatives, has driven the study of this disorder using data collected from social media platforms. In our research, we aim to gauge the severity of symptoms related to depression among social media users. The ultimate goal is to estimate the user’s responses to a well-known standardized psychological questionnaire, the Beck Depression Inventory-II (BDI). This is a 21-question multiple-choice self-report inventory that covers multiple topics about how the subject has been feeling. Mining users’ social media interactions and understanding psychological states represents a challenging goal. To that end, we present here an approach based on search and summarization that extracts multiple BDI-biased summaries from the thread of users’ publications. We also leverage a robust large language model to estimate the potential answer for each BDI item. Our method involves several steps. First, we employ a search strategy based on sentence similarity to obtain pertinent extracts related to each topic in the BDI questionnaire. Next, we compile summaries of the content of these groups of extracts. Last, we exploit chatGPT to respond to the 21 BDI questions, using the summaries as contextual information in the prompt. Our model has undergone rigorous evaluation across various depression datasets, yielding encouraging results. The experimental report includes a comparison against an assessment done by expert humans and competes favorably with state-of...

Conversations in Galician: a Large Language Model for an Underrepresented Language

Eliseo Bao, Anxo Pérez, Javier Parapar
Preprint arXiv:2311.03812, November 2023

Abstract

The recent proliferation of Large Conversation Language Models has highlighted the economic significance of widespread access to this type of AI technologies in the current information age. Nevertheless, prevailing models have primarily been trained on corpora consisting of documents written in popular languages. The dearth of such cutting-edge tools for low-resource languages further exacerbates their underrepresentation in the current economic landscape, thereby impacting their native speakers. This paper introduces two novel resources designed to enhance Natural Language Processing (NLP) for the Galician language. We present a Galician adaptation of the Alpaca dataset, comprising 52,000 instructions and demonstrations. This dataset proves invaluable for enhancing language models by fine-tuning them to more accurately adhere to provided instructions. Additionally, as a demonstration of the dataset utility, we fine-tuned LLaMA-7B to comprehend and respond in Galician, a language not originally supported by the model, by following the Alpaca format. This work contributes to the research on multilingual models tailored for low-resource settings, a crucial endeavor in ensuring the inclusion of all linguistic communities in the development of Large Language Models. Another noteworthy aspect of this research is the exploration of how knowledge of a closely related language, in this case, Portuguese, can assist in generating coherent text when training resources are scarce. Both the Galician Alpaca dataset and Cabuxa-7B are publicly accessible on our Huggingface Hub, and we have made the source code available to facilitate...

How Discriminative Are Your Qrels? How To Study the Statistical Significance of Document Adjudication Methods

David Otero, Javier Parapar, Nicola Ferro
Conference Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, CIKM '23, Birmingham, UK, October 2023, pp 1960-1970

Abstract

Creating test collections for offline retrieval evaluation requires human effort to judge documents' relevance. This expensive activity has motivated much work on constructing benchmarks with lower assessment costs. In this respect, adjudication methods actively decide both which documents experts review and the order in which they review them, in order to better exploit the assessment budget or to lower it. Researchers evaluate the quality of those methods by measuring the correlation between the known gold ranking of systems under the full collection and the observed ranking of systems under the lower-cost one. This traditional analysis ignores whether and how the low-cost judgements impact the statistically significant differences among systems with respect to the full collection. We fill this void by proposing a novel methodology to evaluate how the low-cost adjudication methods preserve the...

Using machine learning techniques to predict adolescents’ involvement in family conflict

Silvia Lopez-Larrosa, Vanesa Sánchez-Souto, David E Losada, Javier Parapar, Álvaro Barreiro, Anh P Ha, Edward M Cummings
Journal Social Science Computer Review, vol. 41, pp 1581-1607, October 2023

Abstract

Many cases of violence against children occur in homes and other close environments. Machine learning is a novel approach that addresses important gaps in ways of examining this socially significant issue, illustrating innovative and emerging approaches for the use of computers from a psychological perspective. In this paper, we aim to use machine learning techniques to predict adolescents' involvement in family conflict in a sample of adolescents living with their families (community adolescents) and adolescents living in residential care centers, who are temporarily separated from their families because of adverse family conditions. Participants were 251 Spanish adolescents (mean age = 15.59), of whom 167 lived in residential care and 84 lived with their families. We measured perceived interparental and family conflict, adolescents' emotional security, and emotional, cognitive, and behavioral immediate responses to...

Overview of eRisk 2023: Early Risk Prediction on the Internet

Javier Parapar, Patricia Martín-Rodilla, David E Losada, Fabio Crestani
Conference Proceedings of the 14th International Conference of the CLEF Association, CLEF 2023, Thessaloniki, Greece, September 18–21, 2023, pp 294-315

Abstract

This paper provides an overview of eRisk 2023, the seventh edition of the CLEF conference’s lab dedicated to early risk detection. Since its inception, our lab has aimed to explore evaluation methodologies, effectiveness metrics, and other processes associated with early risk detection. The applications of early alerting models are diverse and encompass various domains, including health and safety. eRisk 2023 consisted of three tasks. The first task involved ranking sentences based on their relevance to standardised depression symptoms. The second task focused on early detection of signs related to pathological gambling. The third task required participants to automatically estimate an eating disorders questionnaire by analysing user writings on social media.

DepreSym: A Depression Symptom Annotated Corpus and the Role of LLMs as Assessors of Psychological Markers

Anxo Pérez, Marcos Fernández-Pichel, Javier Parapar, David E Losada
Preprint arXiv:2308.10758, 2023

Abstract

Computational methods for depression detection aim to mine traces of depression from online publications posted by Internet users. However, solutions trained on existing collections exhibit limited generalisation and interpretability. To tackle these issues, recent studies have shown that identifying depressive symptoms can lead to more robust models. The eRisk initiative fosters research on this area and has recently proposed a new ranking task focused on developing search methods to find sentences related to depressive symptoms. This search challenge relies on the symptoms specified by the Beck Depression Inventory-II (BDI-II), a questionnaire widely used in clinical practice. Based on the participant systems' results, we present the DepreSym dataset, consisting of 21580 sentences annotated according to their relevance to the 21 BDI-II symptoms. The labelled sentences come from a pool of diverse ranking methods, and the final dataset serves as a valuable resource for advancing the development of models that incorporate depressive markers such as clinical symptoms. Due to the complex nature of this relevance annotation, we designed a robust assessment methodology carried out by three expert assessors (including an expert psychologist). Additionally, we explore here the feasibility of employing recent Large Language Models (ChatGPT and GPT4) as potential assessors in this complex task. We undertake a comprehensive examination of their performance, determine their main limitations and analyze their role as a complement or replacement for human annotators.

Bdi-sen: A sentence dataset for clinical symptoms of depression

Anxo Pérez, Javier Parapar, Álvaro Barreiro, Silvia Lopez-Larrosa
Conference Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '23, Taipei, Taiwan, July 23-27, 2023, pp 2996-3006

Abstract

People tend to consider social platforms as convenient media for expressing their concerns and emotional struggles. With their widespread use, researchers could access and analyze user-generated content related to mental states. Computational models that exploit that data show promising results in detecting at-risk users based on engineered features or deep learning models. However, recent works revealed that these approaches have a limited capacity for generalization and interpretation when considering clinical settings. Grounding the models' decisions on clinical and recognized symptoms can help to overcome these limitations. In this paper, we introduce BDI-Sen, a symptom-annotated sentence dataset for depressive disorder. BDI-Sen covers all the symptoms present in the Beck Depression Inventory-II (BDI-II), a reliable questionnaire used for detecting and measuring depression. The annotations in...

Relevance feedback for building pooled test collections

David Otero, Javier Parapar, Álvaro Barreiro
Journal Journal of Information Science, 2023, article 01655515231171085

Abstract

Offline evaluation of information retrieval systems depends on test collections. These datasets provide the researchers with a corpus of documents, topics and relevance judgements indicating which documents are relevant for each topic. Gathering the latter is costly, requiring human assessors to judge the documents. Therefore, experts usually judge only a portion of the corpus. The most common approach for selecting that subset is pooling. By intelligently choosing which documents to assess, it is possible to optimise the number of positive labels for a given budget. For this reason, much work has focused on developing techniques to better select which documents from the corpus merit human assessments. In this article, we propose using relevance feedback to prioritise the documents when building new pooled test collections. We explore several state-of-the-art statistical feedback methods for prioritising the...

Keyword Embeddings for Query Suggestion

Jorge Gabín, M Eduardo Ares, Javier Parapar
Conference Proceedings of the 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, pp 346-360

Abstract

Nowadays, search engine users commonly rely on query suggestions to improve their initial inputs. Current systems are very good at recommending lexical adaptations or spelling corrections to users’ queries. However, they often struggle to suggest semantically related keywords given a user’s query. The construction of a detailed query is crucial in some tasks, such as legal retrieval or academic search. In these scenarios, keyword suggestion methods are critical to guide the user during the query formulation. This paper proposes two novel models for the keyword suggestion task trained on scientific literature. Our techniques adapt the architecture of Word2Vec and FastText to generate keyword embeddings by leveraging documents’ keyword co-occurrence. Along with these models, we also present a specially tailored negative sampling approach that exploits how keywords appear in academic publications. We...
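
A hedged sketch of the co-occurrence training idea using off-the-shelf gensim with standard negative sampling (the paper's tailored negative-sampling scheme and its FastText variant are not reproduced here):

    from gensim.models import Word2Vec

    # Each "sentence" is the keyword list of one publication, so the model
    # learns embeddings from keyword co-occurrence rather than running text.
    keyword_sets = [
        ["information retrieval", "query expansion", "relevance feedback"],
        ["information retrieval", "test collections", "pooling"],
        ["recommender systems", "collaborative filtering", "information retrieval"],
    ]

    model = Word2Vec(sentences=keyword_sets, vector_size=50, window=10,
                     min_count=1, sg=1, negative=5, epochs=50, seed=1)
    print(model.wv.most_similar("information retrieval", topn=3))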

Psyprof: a platform for assisted screening of depression in social media

Anxo Pérez, Paloma Piot-Pérez-Abadín, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, pp 300-306

Abstract

Depression is one of the most prevalent mental disorders. For its effective treatment, patients need a quick and accurate diagnosis. Mental health professionals use self-report questionnaires to serve that purpose. These standardized questionnaires consider different depression symptoms in their evaluations. However, mental health stigmas heavily influence patients when filling out a questionnaire. In contrast, many people feel more at ease discussing their mental health issues on social media. This demo paper presents a platform for assisted examination and tracking of symptoms of depression for social media users. In order to bring a broader context, we have complemented our tool with user profiling. We show a platform that helps professionals with data labelling, relying on depression estimators and profiling models.

erisk 2023: Depression, pathological gambling, and eating disorder challenges

Javier Parapar, Patricia Martín-Rodilla, David E Losada, Fabio Crestani
Conference Proceedings of the 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, pp 585-592

Abstract

In 2017, we launched eRisk as a CLEF Lab to encourage research on early risk detection on the Internet. Since then, thanks to the participants’ work, we have developed detection models and datasets for depression, anorexia, pathological gambling and self-harm. In 2023, it will be the seventh edition of the lab, where we will present a new type of task on sentence ranking for depression symptoms. This paper outlines the work that we have done to date, discusses key lessons learned in previous editions, and presents our plans for eRisk 2023.

Semantic similarity models for depression severity estimation

Anxo Pérez, Neha Warikoo, Kexin Wang, Javier Parapar, Iryna Gurevych
Conference Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, 2023, pp 16104-16118

Abstract

Depressive disorders constitute a severe public health issue worldwide. However, public health systems have limited capacity for case detection and diagnosis. In this regard, the widespread use of social media has opened up a way to access public information on a large scale. Computational methods can serve as support tools for rapid screening by exploiting this user-generated social media content. This paper presents an efficient semantic pipeline to study depression severity in individuals based on their social media writings. We select test user sentences for producing semantic rankings over an index of representative training sentences corresponding to depressive symptoms and severity levels. Then, we use the sentences from those results as evidence for predicting users' symptom severity. For that, we explore different aggregation methods to answer one of four Beck Depression Inventory (BDI) options per symptom. We evaluate our methods on two Reddit-based benchmarks, achieving a 30% improvement over the state of the art in terms of measuring depression severity.

Automatic depression score estimation with word embedding models

Anxo Pérez, Javier Parapar, Álvaro Barreiro
Journal Artificial Intelligence in Medicine, vol. 132, pp 102380, 2022

Abstract

Depression is one of the most common mental health illnesses. The biggest obstacle lies in efficient and early detection of the disorder. Self-report questionnaires are the instruments used by medical experts to reach a diagnosis. These questionnaires were designed by analyzing different depressive symptoms. However, factors such as social stigma negatively affect the success of traditional methods. This paper presents a novel approach for automatically estimating the degree of depression in social media users. In this regard, we addressed the task Measuring the Severity of the Signs of Depression of eRisk 2020, an initiative in the CLEF Conference. We aimed to explore neural language models to exploit different aspects of the subject's writings depending on the symptom to capture. We devised two distinct methods based on the symptoms' sensitivity in terms of willingness to comment on them…

Early Detection of Mental Health Disorders by Social Media Monitoring

Fabio Crestani, David E Losada, Javier Parapar
Book Springer, 2022

Abstract

eRisk stands for Early Risk Prediction on the Internet. It is concerned with the exploration of techniques for the early detection of mental health disorders which manifest in the way people write and communicate on the internet, in particular in user-generated content (e.g. Facebook, Twitter, or other social media). Early detection technologies can be employed in several different areas, particularly in those related to health and safety. For instance, early alerts could be sent when the writing of a teenager starts showing increasing signs of depression, or when a social media user starts showing suicidal inclinations, or when a potential offender starts publishing antisocial threats on a blog, forum or social network. eRisk has been the pioneer of a new interdisciplinary area of research that is potentially applicable to a wide variety of situations, problems and personal profiles. This book presents the best results of the first five years of the eRisk project, which started in 2017 and developed into one of the most successful tracks of CLEF, the Conference and Labs of the Evaluation Forum.

Temporal Word Embeddings for Early Detection of Signs of Depression.

Manuel Couto, Anxo Pérez, Javier Parapar
Conference CIRCLE, 2022

Abstract

Depression is one of the most debilitating mental health diseases. Detecting the presence of depressive symptoms in the early stages of the disease is essential to reduce further consequences. As the study of language and behaviour is a pivotal component of mental health research, social network content positions itself as a helpful tool. This paper introduces a general framework to analyze variations in an individual's use of language over time on social media. We present a novel approach using temporal word representations to quantify the magnitude of word movements. This framework allows us to evaluate whether word evolution can reveal the presence of depressive tendencies. We adapted different temporal word embedding representations to our framework and assessed them on Reddit benchmark datasets. Our results are highly competitive with state-of-the-art methods, showing the potential that time-aware word representation models can bring to early detection scenarios.
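
One standard way to quantify such word movements, sketched here assuming orthogonal Procrustes alignment between embedding snapshots (a common technique for comparing embedding spaces, not necessarily the paper's exact adaptation):

    import numpy as np

    def word_drift(emb_t0, emb_t1, vocab):
        """Align the earlier embedding space onto the later one with an
        orthogonal rotation, then measure how far each word moved; large
        movement suggests a change in how the word is used."""
        X = np.stack([emb_t0[w] for w in vocab])
        Y = np.stack([emb_t1[w] for w in vocab])
        U, _, Vt = np.linalg.svd(X.T @ Y)  # solves min_Q ||XQ - Y|| over rotations
        Q = U @ Vt
        return dict(zip(vocab, np.linalg.norm(X @ Q - Y, axis=1)))

    t0 = {"crisis": np.array([1.0, 0.0]), "music": np.array([0.0, 1.0])}
    t1 = {"crisis": np.array([0.6, 0.8]), "music": np.array([0.0, 1.0])}
    print(word_drift(t0, t1, ["crisis", "music"]))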

Exploring Models for Automatic Keyword Labelling of Scientific Documents.

Jorge Gabín, M Eduardo Ares, Javier Parapar
Conference CIRCLE, 2022

Abstract

Automatic keyword labelling methods generate a set of short phrases for a given document, providing a short and good description of its content. Those labels are critical in tasks such as exploratory search and for improving the information discovery experience. This paper presents a novel keyword labelling model based on text-to-text transfer transformers (T5). We train a T5 model to generate keywords from academic documents' content. We name this model docT5keywords. We compare our proposal with the state-of-the-art EmbedRank model, based on Sent2Vec embeddings, and even with the keywords manually assigned by the authors to represent their writings. Our proposal does not merely extract fragments of the texts but may also produce unseen labels. We commonly refer to such models as creative models. Classical evaluation based on matching against a set of ground-truth labels extracted from the texts is not the best alternative when examining the performance of creative methods. Therefore, we also present an alternative user-based evaluation methodology for creative keyword generation models. In our user study, we examine the performance of the tested models using four expert assessors while analysing the assessor agreement and the correlation with the classical offline evaluation methodologies.

eRisk 2022: pathological gambling, depression, and eating disorder challenges

Javier Parapar, Patricia Martín-Rodilla, David E Losada, Fabio Crestani
Conference Proceedings of the 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, pp 436-442

Abstract

In 2017, we launched eRisk as a CLEF Lab to encourage research on early risk detection on the Internet. eRisk 2021 was the fifth edition of the Lab. Since then, we have created a large number of collections for early detection addressing different problems (e.g., depression, anorexia or self-harm). This paper outlines the work that we have done to date (2017, 2018, 2019, 2020, and 2021), discusses key lessons learned in previous editions, and presents our plans for eRisk 2022, which introduces a new challenge to assess the severity of eating disorders.

Building Cultural Heritage Reference Collections from Social Media through Pooling Strategies: The Case of 2020’s Tensions Over Race and Heritage

David Otero, Patricia Martin-Rodilla, Javier Parapar
Journal ACM Journal on Computing and Cultural Heritage (JOCCH), vol. 15, pp. 1-13, 2021

Abstract

Social networks constitute a valuable source for documenting heritage constitution processes or obtaining a real-time snapshot of a cultural heritage research topic. Many heritage researchers use social networks as a social thermometer to study these processes, creating, for this purpose, collections that constitute born-digital archives potentially reusable, searchable, and of interest to other researchers or citizens. However, retrieval and archiving techniques used in social networks within heritage studies are still semi-manual, being a time-consuming task and hindering the reproducibility, evaluation, and open-up of the collections created. By combining Information Retrieval strategies with emerging archival techniques, some of these weaknesses can be left behind. Specifically, pooling is a well-known Information Retrieval method to extract a sample of documents from an entire document set (posts in case of...

Towards unified metrics for accuracy and diversity for recommender systems

Javier Parapar, Filip Radlinski
Conference Proceedings of the 15th ACM Conference on Recommender Systems, ACM RecSys '21, Amsterdam, Netherlands, September 27 - October 1, 2021, pp 75-84

Abstract

Recommender systems evaluation has evolved rapidly in recent years. However, for offline evaluation, accuracy is the de facto standard for assessing the superiority of one method over another, with most research comparisons focused on tasks ranging from rating prediction to ranking metrics for top-n recommendation. Simultaneously, recommendation diversity and novelty have become recognized as critical to users’ perceived utility, with several new metrics recently proposed for evaluating these aspects of recommendation lists. Consequently, the accuracy-diversity dilemma frequently shows up as a choice to make when creating new recommendation algorithms. We propose a novel adaptation of a unified metric, derived from one commonly used for search system evaluation, to Recommender Systems. The proposed metric combines topical diversity and accuracy, and we show it to satisfy a set of desired...
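
The abstract does not name the underlying search metric, but α-nDCG is the usual accuracy-plus-diversity measure of this kind. Below is a sketch of an (unnormalised) α-DCG computation, assuming each recommended item is labelled with the user-interest aspects it covers; this is an illustration, not the paper's exact proposal:

    import math

    def alpha_dcg(ranking, alpha=0.5):
        """ranking: list of sets, each holding the aspects the item at that
        rank covers. Gains reward relevance but are discounted for aspects
        already covered earlier in the list."""
        seen = {}
        score = 0.0
        for rank, aspects in enumerate(ranking, start=1):
            gain = sum((1 - alpha) ** seen.get(a, 0) for a in aspects)
            score += gain / math.log2(rank + 1)
            for a in aspects:
                seen[a] = seen.get(a, 0) + 1
        return score

    # Two relevant items covering different aspects beat two covering the same one.
    print(alpha_dcg([{"a"}, {"b"}]) > alpha_dcg([{"a"}, {"a"}]))  # True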

Gender classification models and feature impact for social media author profiling

Paloma Piot-Pérez-Abadín, Patricia Martín-Rodilla, Javier Parapar
Conference Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2021, Virtual Event, April 26-27, 2021, Revised Selected Papers, pp 265-287

Abstract

Automatic profiling models infer demographic characteristics of social network users from their generated content or interactions. Due to its use in business (targeted advertising, market studies...), automatic user profiling from social networks has become a popular task. Users' demographic data is also crucial information for more socially concerning tasks, such as automatic early detection of mental disorders. For this type of user analysis task, it has been demonstrated that the way users employ language is an essential indicator that contributes to the effectiveness of the models. For this reason, we also believe that considering the usage of language, through both psycho-linguistic and semantic characteristics, is useful for detecting variables such as gender, age, and the user's origin. A proper selection of features will be critical for the performance of retrieval, classification, and decision-making software systems, a...

eRisk 2021: Pathological gambling, self-harm and depression challenges

Javier Parapar, Patricia Martín-Rodilla, David E Losada, Fabio Crestani
Conference Proceedings of the 43rd European Conference on IR Research, ECIR 2021, Virtual Event, March 28 – April 1, 2021, pp 650-656

Abstract

eRisk, a CLEF lab oriented to early risk prediction on the Internet, started in 2017 as a forum to foster experimentation on early risk detection. After four editions (2017, 2018, 2019 and 2020), the lab has created many reference collections in the field and organized multiple early risk detection challenges using those datasets. Each challenge focused on a specific early risk detection problem (e.g., depression, anorexia or self-harm). This paper describes the work done so far, discusses the main lessons learned over the past editions and the plans for the eRisk 2021 edition, where we introduced pathological gambling as a new early risk detection challenge.

The wisdom of the rankers: a cost-effective method for building pooled test collections without participant systems

David Otero, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 36th Annual ACM Symposium on Applied Computing, ACM SAC'21, Virtual Event, Republic of Korea, March 22-26, 2021, pp 672-680

Abstract

Information Retrieval is an area where evaluation is crucial to validate newly proposed models. As the first step in the evaluation of models, researchers carry out offline experiments on specific datasets. While the field started around ad-hoc search, the number of new tasks is continuously growing. These tasks demand the development of new test collections (documents, information needs, and judgments). The construction of those datasets relies on expensive campaigns like TREC. Due to the size of modern collections, obtaining the relevance for each document-topic pair is infeasible. To reduce this cost, organizers usually apply a technique called pooling. When building pooled test collections, assessors only judge a portion of the documents selected among the participants' results. Although the judgments will not be exhaustive, they will be sufficiently complete and unbiased if pooling is done correctly...

Testing the tests: simulation of rankings to compare statistical significance tests in information retrieval evaluation

Javier Parapar, David E Losada, Álvaro Barreiro
Conference Proceedings of the 36th Annual ACM Symposium on Applied Computing, ACM SAC'21, Virtual Event, Republic of Korea, March 22-26, 2021, pp 655-664

Abstract

Null Hypothesis Significance Testing (NHST) has been recurrently employed as the reference framework to assess the difference in performance between Information Retrieval (IR) systems. IR practitioners customarily apply significance tests, such as the t-test, the Wilcoxon Signed Rank test, the Permutation test, the Sign test or the Bootstrap test. However, the question of which of these tests is the most reliable in IR experimentation is still controversial. Different authors have tried to shed light on this issue, but their conclusions are not in agreement. In this paper, we present a new methodology for assessing the behavior of significance tests in typical ranking tasks. Our method creates models from the search systems and uses those models to simulate different inputs to the significance tests. With such an approach, we can control the experimental conditions and run experiments with full knowledge about the truth or...
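
As a loose illustration of the comparison the abstract describes (not the paper's model-based simulation pipeline), the sketch below applies several of the named tests to paired per-topic scores of two systems; the scores are invented, and it assumes NumPy and SciPy (1.7 or later) are available.

```python
# Hedged sketch: run the significance tests named above on paired per-topic
# effectiveness scores. The data is simulated with a small injected true
# difference; a real comparison would use scores from actual runs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
topics = 50
sys_a = rng.beta(2, 5, size=topics)                            # per-topic AP of system A
sys_b = np.clip(sys_a + rng.normal(0.02, 0.05, topics), 0, 1)  # slightly better system B

print("t-test       p =", stats.ttest_rel(sys_b, sys_a).pvalue)
print("Wilcoxon     p =", stats.wilcoxon(sys_b, sys_a).pvalue)

# Sign test: a binomial test on the number of positive paired differences.
diff = sys_b - sys_a
nonzero = diff[diff != 0]
print("sign test    p =", stats.binomtest(int((nonzero > 0).sum()), nonzero.size).pvalue)

# Paired permutation test on the mean of the differences.
perm = stats.permutation_test((sys_b, sys_a), lambda x, y: np.mean(x - y),
                              permutation_type="samples", n_resamples=10_000,
                              random_state=0)
print("permutation  p =", perm.pvalue)
```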

Diverse user preference elicitation with multi-armed bandits

Javier Parapar, Filip Radlinski
Conference Proceedings of the 14th ACM International Conference on Web Search and Data Mining, WSDM '21, Virtual Event, Israel, March 8-12, 2021, pp 130-138

Abstract

Personalized recommender systems rely on knowledge of user preferences to produce recommendations. While those preferences are often obtained from past user interactions with the recommendation catalog, in some situations such observations are insufficient or unavailable. The most widely studied case is with new users, although other similar situations arise where explicit preference elicitation is valuable. At the same time, a seemingly disparate challenge is that there is a well-known popularity bias in many algorithmic approaches to recommender systems. The most common way of addressing this challenge is diversification, which tends to be applied to the output of a recommender algorithm, prior to items being presented to users. We tie these two problems together, showing a tight relationship. Our results show that popularity bias in preference elicitation contributes to popularity bias in recommendation...
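
For flavour, here is a minimal Beta-Bernoulli Thompson sampling loop for elicitation. The arm design (popularity strata) and the simulated user responses are assumptions made for illustration only, not the elicitation algorithms evaluated in the paper.

```python
# Hedged sketch: Thompson sampling over item-popularity strata. Each round we
# sample a plausible like-rate per arm from its Beta posterior, query the best
# arm, and update the posterior with the (simulated) user's answer.
import random

arms = ["head", "torso", "tail"]           # popularity strata to ask about
alpha = {a: 1.0 for a in arms}             # Beta posterior: successes + 1
beta = {a: 1.0 for a in arms}              # Beta posterior: failures + 1

def ask_user_about(arm):
    # Placeholder user model: True means the user liked the sampled item.
    return random.random() < {"head": 0.6, "torso": 0.5, "tail": 0.4}[arm]

for _ in range(100):
    sampled = {a: random.betavariate(alpha[a], beta[a]) for a in arms}
    arm = max(sampled, key=sampled.get)
    if ask_user_about(arm):
        alpha[arm] += 1
    else:
        beta[arm] += 1

print({a: round(alpha[a] / (alpha[a] + beta[a]), 2) for a in arms})
```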

Overview of eRisk at CLEF 2021: Early Risk Prediction on the Internet (Extended Overview).

Javier Parapar, Patricia Martín-Rodilla, David E Losada, Fabio Crestani
Working Notes CLEF 2021 Working Notes, 2021, vol. 1, pp 864-887

Abstract

This article provides an overview of eRisk 2021, the fifth edition of the CLEF conference's lab dedicated to early risk detection. Our lab has been committed to exploring evaluation methodologies, effectiveness metrics, and other associated processes in the field of early risk detection since its inception. The applications of early alerting models are wide-ranging and span various domains, including health and safety. eRisk 2021 encompassed three tasks. The initial task, introduced this year, concentrated on detecting signs of pathological gambling early. The second task focused on the early detection of signs of self-harm. Lastly, the third task required participants to estimate the severity of the signs of depression by analyzing user writings on social media. In this extended overview, we include additional details about the participants' proposals and more detailed explanations about metrics.

Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling

Paloma Piot, Patricia Martín-Rodilla, Javier Parapar
Conference Proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2021), pp 103-113

Abstract

Automatic user profiling from social networks has become a popular task due to its commercial applications (targeted advertising, market studies...). Automatic profiling models infer demographic characteristics of social network users from their generated content or interactions. Users' demographic information is also valuable for more socially concerning tasks, such as the automatic early detection of mental disorders. For this type of user analysis task, it has been shown that the way users employ language is an important indicator that contributes to the effectiveness of the models. Therefore, we also consider that, for identifying aspects such as gender, age or user origin, it is worth exploiting language use through both psycho-linguistic and semantic features. A good selection of features will be vital for the performance of retrieval, classification, and decision-making software systems. In this paper, we address gender classification as a part of the automatic profiling task. We present an experimental analysis of the performance of existing gender classification models based on external corpora and baselines for automatic profiling. We analyse in depth the influence of the linguistic features on the classification accuracy of the model. After that analysis, we put together a feature set for gender classification models in social networks with accuracy above existing baselines.

A self-assessment protocol and its impact compared with traditional techniques in the practical teaching of Operating Systems

Patricia Martín-Rodilla, Javier Parapar, Álvaro Barreiro
Conference Xornadas de Innovación Docente (5º. 2021. A Coruña), pp 233-248

Abstract

Recent studies on the impact of self-assessment mechanisms in Computer Engineering teaching point out the benefits of self-assessment as a formative tool in itself, as a means to achieve genuine continuous assessment, and as a way to improve the effectiveness of teacher-student feedback mechanisms. This, together with the new teaching scenarios opened up by the Covid-19 pandemic, calls for the adoption of innovative feedback and assessment dynamics. In this article, we present the development and deployment of a self-assessment protocol as part of the teaching innovation activities in the Operating Systems course. The protocol was designed with the goal of incorporating self-assessment throughout all the laboratory assignments of the course. We then carried out a comparative empirical study analysing the accuracy of the self-assessments produced by an experimental group against the blind grades given by the instructor of the laboratory group. The results provide an initial evaluation of the designed protocol and establish a starting point for students' self-assessment abilities in the specific subject areas of Operating Systems.

Self-assessment, peer assessment and an empirical study versus traditional assessment techniques in the Operating Systems course

Patricia Martín-Rodilla, Javier Parapar López
Conference Actas de las Jornadas sobre la Enseñanza Universitaria de la Informática (JENUI), 2021, pp 107-114

Abstract

In this article, we present a protocol developed as part of the teaching innovation activities in Operating Systems. The goal of the protocol is to incorporate self-assessment as a formative assessment mechanism for Operating Systems competencies, fostering transversal skills of particular relevance to the design and implementation of Operating Systems functionality: critical analysis, detection of possible improvements, and awareness of one's own learning process. To this end, we designed a self-assessment and peer-assessment protocol for the practical tests of the course covering three subject areas of Operating Systems: file systems, memory management, and process management and scheduling. To allow both assessment systems to coexist, the protocol was applied to part of the student body while instructor grading was maintained in parallel. We then carried out an empirical study of the accuracy of students' self-assessment and peer assessment, comparing the protocol with traditional assessment and making an initial evaluation of its implications for the final grades obtained. The results allow us not only to make an initial evaluation of the designed protocol, but also to establish a starting point for students' self-assessment and peer-assessment abilities in the particular subject areas of Operating Systems.

Designing an open source virtual assistant

Anxo Pérez, Paula Lopez-Otero, Javier Parapar
Proceedings Proceedings, vol. 54, pp 30, 2020

Abstract

A chatbot is a type of agent that allows people to interact with an information repository using natural language. Nowadays, chatbots have been incorporated in the form of conversational assistants on the most important mobile and desktop platforms. In this article, we present our design of an assistant developed with open-source and widely used components. Our proposal covers the process end-to-end, from information gathering and processing to visual and speech-based interaction. We have deployed a proof of concept over the website of our Computer Science Faculty.

Assessing ranking metrics in top-N recommendation

Daniel Valcarce, Alejandro Bellogín, Javier Parapar, Pablo Castells
Journal Information Retrieval Journal, vol. 23, pp 411-448, 2020

Abstract

The evaluation of recommender systems is an area with unsolved questions at several levels. Choosing the appropriate evaluation metric is one of such important issues. Ranking accuracy is generally identified as a prerequisite for recommendation to be useful. Ranking metrics have been adapted for this purpose from the Information Retrieval field into the recommendation task. In this article, we undertake a principled analysis of the robustness and the discriminative power of different ranking metrics for the offline evaluation of recommender systems, drawing from previous studies in the information retrieval field. We measure the robustness to different sources of incompleteness that arise from the sparsity and popularity biases in recommendation. Among other results, we find that precision provides high robustness while normalized discounted cumulative gain offers the best discriminative power. In dealing with...
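
For concreteness, the two metrics the abstract singles out can be computed as below; this is a hedged, minimal rendering with binary relevance, not the exact evaluation code behind the study.

```python
# Minimal sketch of Precision@k and nDCG@k over a ranked list of item ids,
# given a set of relevant items (binary relevance).
import math

def precision_at_k(ranking, relevant, k):
    return sum(1 for item in ranking[:k] if item in relevant) / k

def ndcg_at_k(ranking, relevant, k):
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranking[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

ranking = ["i3", "i7", "i1", "i9", "i4"]
relevant = {"i1", "i3"}
print(precision_at_k(ranking, relevant, 5), ndcg_at_k(ranking, relevant, 5))
```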

Beaver: Efficiently Building Test Collections for Novel Tasks.

David Otero, Javier Parapar, Álvaro Barreiro
Conference CIRCLE, 2020

Abstract

Evaluation is a mandatory task for Information Retrieval research. Under the Cranfield paradigm, this evaluation needs test collections. The creation of these is a time and resource-consuming process. At the same time, new tasks and models are continuously appearing. These tasks demand the building of new test collections. Typically, researchers organize TREC-like competitions for building these evaluation benchmarks. This is very expensive, both for the organizers and for the participants. In this paper, we present a platform to easily and cheaply build datasets for Information Retrieval evaluation without the need to organize expensive campaigns. In particular, we propose the simulation of participant systems and the use of pooling strategies to make the most of the assessors' work. Our platform aims to cover the whole process of building the test collection, from document gathering to judgment creation.

eRisk 2020: Self-harm and depression challenges

David E Losada, Fabio Crestani, Javier Parapar
Conference Proceedings of the 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, pp 557-563

Abstract

This paper describes eRisk, the CLEF lab on early risk prediction on the Internet. eRisk started in 2017 as an attempt to set the experimental foundations of early risk detection. Over the last three editions of eRisk (2017, 2018 and 2019), the lab organized a number of early risk detection challenges oriented to the problems of detecting depression, anorexia and self-harm. We review in this paper the main lessons learned from the past and we discuss our future plans for the 2020 edition.

Novel and diverse recommendations by leveraging linear models with user and item embeddings

Alfonso Landin, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, pp 215-222

Abstract

Nowadays, item recommendation is an increasing concern for many companies. Users tend to be more reactive than proactive when solving their information needs. Recommendation accuracy has become the most studied aspect of the quality of the suggestions. However, novel and diverse suggestions also contribute to user satisfaction. Unfortunately, it is common to harm those two aspects when optimizing recommendation accuracy. In this paper, we present EER, a linear model for the top-N recommendation task, which takes advantage of user and item embeddings to improve novelty and diversity without harming accuracy.

Statistical language models for query-by-example spoken document retrieval

Paula Lopez-Otero, Javier Parapar, Álvaro Barreiro
Journal Multimedia Tools and Applications, vol. 79, pp 7927-7949, 2020

Abstract

Query-by-example spoken document retrieval (QbESDR) consists in, given a collection of documents, computing how likely a spoken query is present in each document. This is usually done by means of pattern matching techniques based on dynamic time warping (DTW), which leads to acceptable results but is inefficient in terms of query processing time. In this paper, the use of probabilistic retrieval models for information retrieval is applied to the QbESDR scenario. First, each document is represented by means of a language model, as commonly done in information retrieval, obtained by estimating the probability of the different n-grams extracted from automatic phone transcriptions of the documents. Then, the score of a query given a document can be computed following the query likelihood retrieval model. Besides the adaptation of this model to QbESDR, this paper presents two techniques that aim at...

Alfonso Landin, Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 24th European Conference on Artificial Intelligence, ECAI'20, 29 August–8 September 2020, Santiago de Compostela, Spain, 2020, pp 2911-2912

Abstract

In the field of Information Retrieval, word embedding models have shown to be effective in several tasks. In this paper, we show how one of these neural embedding techniques can be adapted to the recommendation task. This adaptation only makes use of collaborative filtering information, and the results show that it is able to produce effective recommendations efficiently.

Using score distributions to compare statistical significance tests for information retrieval evaluation

Javier Parapar, David E Losada, Manuel A Presedo-Quindimil, Álvaro Barreiro
Journal Journal of the Association for Information Science and Technology, vol. 71, pp 98-113, 2020

Abstract

Statistical significance tests can provide evidence that the observed difference in performance between 2 methods is not due to chance. In information retrieval (IR), some studies have examined the validity and suitability of such tests for comparing search systems. We argue here that current methods for assessing the reliability of statistical tests suffer from some methodological weaknesses, and we propose a novel way to study significance tests for retrieval evaluation. Using Score Distributions, we model the output of multiple search systems, produce simulated search results from such models, and compare them using various significance tests. A key strength of this approach is that we assess statistical tests under perfect knowledge about the truth or falseness of the null hypothesis. This new method for studying the power of significance tests in IR evaluation is formal and innovative. Following this type of analysis...

Collaborative filtering embeddings for memory-based recommender systems

Daniel Valcarce, Alfonso Landin, Javier Parapar, Álvaro Barreiro
Journal Engineering Applications of Artificial Intelligence, vol. 85, pp 347-356, 2019

Abstract

Word embeddings techniques have attracted a lot of attention recently due to their effectiveness in different tasks. Inspired by the continuous bag-of-words model, we present prefs2vec, a novel embedding representation of users and items for memory-based recommender systems that rely solely on user–item preferences such as ratings. To improve the performance and prevent overfitting, we use a variant of dropout as regularization, which can leverage existent word2vec implementations. Additionally, we propose a procedure for incremental learning of embeddings that boosts the applicability of our proposal to production scenarios. The experiments show that prefs2vec with a standard memory-based recommender system outperforms all the state-of-the-art baselines in terms of ranking accuracy, diversity, and novelty.
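
As a loose analogy of the idea (not the prefs2vec implementation itself), one can treat each user's rated items as a "sentence" and train a CBOW word2vec model, so that items preferred by similar users end up with nearby embeddings. The sketch assumes gensim is installed; the profiles are invented.

```python
# Hedged sketch: CBOW-style item embeddings learned from user-item
# preferences only, by analogy with the continuous bag-of-words model.
from gensim.models import Word2Vec

user_profiles = [
    ["item1", "item5", "item9"],   # items rated by user A
    ["item1", "item5", "item2"],   # items rated by user B
    ["item9", "item4", "item5"],   # items rated by user C
]
model = Word2Vec(sentences=user_profiles, vector_size=32, window=5,
                 min_count=1, sg=0)   # sg=0 selects the CBOW architecture
print(model.wv.most_similar("item5", topn=2))
```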

Building High-Quality Datasets for Information Retrieval Evaluation at a Reduced Cost

David Otero, Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Journal Proceedings, vol. 21, article 33, 2019

Abstract

Information Retrieval is no longer exclusively about document ranking. New tasks are continuously being proposed in this and sibling fields. With this proliferation of tasks, it becomes crucial to have a cheap way of constructing test collections to evaluate the new developments. Building test collections is time and resource consuming: it requires time to obtain the documents and to define the user needs, and it requires assessors to judge many documents. To reduce the latter, pooling strategies aim to decrease the assessment effort by presenting to the assessors a sample of documents from the corpus with the maximum number of relevant documents in it. In this paper, we propose the preliminary design of different techniques to easily and cheaply build high-quality test collections without the need for participant systems.

PRIN: A Probabilistic Recommender with Item Priors and Neural Models

Alfonso Landin, Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 41st European Conference on Information Retrieval, ECIR 2019, Cologne, Germany, 14-18 April, 2019, Lecture Notes in Computer Science vol. 11437, pp 133-147 | ISBN: 978-3-030-15711-1

Abstract

In this paper, we present PRIN, a probabilistic collaborative filtering approach for top-N recommendation. Our proposal relies on the continuous bag-of-words (CBOW) neural model. This fully connected feedforward network takes as input the item profile and produces as output the conditional probabilities of the users given the item. With that information, our model produces item recommendations through Bayesian inversion. The inversion requires the estimation of item priors. We propose different estimates based on centrality measures on a graph that models user-item interactions. An exhaustive evaluation of this proposal shows that our technique outperforms popular state-of-the-art baselines regarding ranking accuracy while showing good values of diversity and novelty.
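
The Bayesian inversion step can be pictured as below. This is only a hedged illustration with invented numbers: the degree-based prior is one of several centrality estimates the paper considers, and the conditional probabilities would come from the trained network.

```python
# Hedged sketch: given p(u|i) for one user u over all items, rank items by
# p(i|u) proportional to p(u|i) * p(i), with a degree-centrality item prior.
import numpy as np

items = ["i1", "i2", "i3"]
p_user_given_item = np.array([0.10, 0.30, 0.05])  # network output for user u
degree = np.array([50, 10, 40])                   # user-item graph degrees
p_item = degree / degree.sum()                    # item prior p(i)

scores = p_user_given_item * p_item               # unnormalised p(i|u)
for item, s in sorted(zip(items, scores), key=lambda t: -t[1]):
    print(item, round(float(s), 4))
```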

Early Detection of Risks on the Internet: An Exploratory Campaign

David Losada, Fabio Crestani, Javier Parapar
Conference Proceedings of the 41st European Conference on Information Retrieval, ECIR 2019, Cologne, Germany, 14-18 April, 2019, Lecture Notes in Computer Science vol. 11438, pp 259-266 | ISBN: 978-3-030-15711-1

Abstract

This paper summarizes the activities related to the CLEF lab on early risk prediction on the Internet (eRisk). eRisk was initiated in 2017 as an attempt to set the experimental foundations of early risk detection. The first edition essentially focused on a pilot task on early detection of signs of depression. In 2018, the lab was enlarged and included an additional task oriented to early detection of signs of anorexia. We review here the main lessons learned and we discuss our plans for 2019.

Using Score Distributions to Compare Statistical Significance Tests for Information Retrieval Evaluation

Javier Parapar, Álvaro Barreiro, Manuel A. Presedo-Quindimil, David E. Losada
Journal Journal of the Association for Information Science and Technology, in press, 2019 | ISSN: 2330-1643

Abstract

Statistical significance tests can provide evidence that the observed difference in performance between two methods is not due to chance. In Information Retrieval, some studies have examined the validity and suitability of such tests for comparing search systems. We argue here that current methods for assessing the reliability of statistical tests suffer from some methodological weaknesses, and we propose a novel way to study significance tests for retrieval evaluation. Using Score Distributions, we model the output of multiple search systems, produce simulated search results from such models, and compare them using various significance tests. A key strength of this approach is that we assess statistical tests under perfect knowledge about the truth or falseness of the null hypothesis. This new method for studying the power of significance tests in Information Retrieval evaluation is formal and innovative. Following this type of analysis, we found that both the sign test and Wilcoxon signed test have more power than the permutation test and the t-test. The sign test and Wilcoxon signed test also have a good behavior in terms of type I errors. The bootstrap test shows few type I errors, but it has less power than the other methods tested.

When to Stop Making Relevance Judgments? A Study of Stopping Methods for Building Information Retrieval Test Collections.

David E. Losada, Javier Parapar, Álvaro Barreiro
Journal Journal of the Association for Information Science and Technology, vol. 70, issue 1, pp. 46-60, 2019 | ISSN: 2330-1643

Abstract

In Information Retrieval evaluation, pooling is a well-known technique to extract a sample of documents to be assessed for relevance. Given the pooled documents, a number of studies have proposed different prioritization methods to adjudicate documents for judgment. These methods follow different strategies to reduce the assessment effort. However, there is no clear guidance on how many relevance judgments are required for creating a reliable test collection. In this paper we investigate and further develop methods to determine when to stop making relevance judgments. We propose a highly diversified set of stopping methods and provide a comprehensive analysis of the usefulness of the resulting test collections. Some of the stopping methods introduced here combine innovative estimates of recall with time series models used in Financial Trading. Experimental results on several representative collections show that some stopping methods can reduce up to 95% of the assessment effort and still produce a robust test collection. We demonstrate that the reduced set of judgments can be reliably employed to compare search systems using disparate effectiveness metrics such as Average Precision, NDCG, P@100 and Rank Biased Precision. With all these measures, the correlations found between full pool rankings and reduced pool rankings are very high.
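
The simplest member of this family of rules can be sketched as follows; it is only a hedged illustration, since the stopping methods in the paper use recall estimates and time-series models rather than this naive patience window.

```python
# Hedged sketch of a naive stopping rule: stop judging once a window of
# consecutive assessments contains no relevant document, on the assumption
# that the prioritised pool is close to exhausted by then.
def judge_until_stable(prioritised_docs, is_relevant, window=50):
    judged, since_last_relevant = [], 0
    for doc in prioritised_docs:
        rel = is_relevant(doc)          # a human assessment in a real setting
        judged.append((doc, rel))
        since_last_relevant = 0 if rel else since_last_relevant + 1
        if since_last_relevant >= window:
            break
    return judged
```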

Efficient query-by-example spoken document retrieval combining phone multigram representation and dynamic time warping

Paula López-Otero, Javier Parapar, Álvaro Barreiro
Journal Information Processing and Management, vol. 56, issue 1, pp. 43-60, 2019 | ISSN: 0306-4573

Abstract

Query-by-example spoken document retrieval (QbESDR) aims at finding those documents in a set that include a given spoken query. Current approaches are, in general, not valid for real-world applications, since they are mostly focused on being effective (i.e. reliably detecting in which documents the query is present) but practical implementations must also be efficient (i.e. the search must be performed in a limited time) in order to allow for a satisfactory user experience. In addition, systems usually search for exact matches of the query, which limits the number of relevant documents retrieved by the search. This paper proposes a representation of the documents and queries for QbESDR based on combining different-sized phone n-grams obtained from automatic transcriptions, namely phone multigram representation. Since phone transcriptions usually have errors, several hypotheses for the query transcriptions are combined in order to ease the impact of these errors. The proposed system stores the documents in inverted indices, which leads to fast and efficient search. Different combinations of the phone multigram strategy with a state-of-the-art system based on pattern matching using dynamic time warping (DTW) are proposed: one is a two-stage system intended to be as effective as, but more efficient than, a DTW-based system, while the other aims at improving the performance achieved by these two systems by combining their output scores. Experiments performed on the MediaEval 2014 Query-by-Example Search on Speech (QUESST 2014) evaluation framework suggest that the phone multigram representation for QbESDR is a successful approach, and the assessed combinations with a DTW-based strategy lead to more efficient and effective QbESDR systems. In addition, the phone multigram approach succeeded in increasing the detection of non-exact matches of the queries.

Document-based and Term-based Linear Methods for Pseudo-Relevance Feedback

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Journal Applied Computing Review, vol. 18, issue 4, pp. 5-17, 2018

Abstract

Query expansion is a successful approach for improving Information Retrieval effectiveness. This work focuses on pseudo-relevance feedback (PRF), which provides an automatic method for expanding queries without explicit user feedback. These techniques perform an initial retrieval with the original query and select expansion terms from the top retrieved documents. We propose two linear methods for pseudo-relevance feedback, one document-based and another term-based, that model the PRF task as a matrix decomposition problem. These factorizations involve the computation of an inter-document or inter-term similarity matrix which is used for expanding the original query. These decompositions can be computed by solving a least squares regression problem with regularization and a non-negativity constraint. We evaluate our proposals on five collections against state-of-the-art baselines. We found that the term-based formulation provides high figures of MAP, nDCG and robustness index, whereas the document-based formulation provides very cheap computation at the cost of a slight decrease in effectiveness.

On the Robustness and Discriminative Power of IR Metrics for Top-N Recommendation

Daniel Valcarce, Alejandro Bellogín, Javier Parapar, Pablo Castells
Conference Proceedings of the 12th ACM Conference on Recommender Systems, pp. 260-268, ACM RecSys 2018, Vancouver, Canada, 2-7 October, 2018 | ISBN: 978-1-4503-5901-6

Abstract

The evaluation of Recommender Systems is still an open issue in the field. Despite its limitations, offline evaluation usually constitutes the first step in assessing recommendation methods due to its reduced costs and high reproducibility. Selecting the appropriate metric is critical, and ranking accuracy usually attracts the most attention nowadays. In this paper, we aim to shed light on the advantages of different ranking metrics which were previously used in Information Retrieval and are now used for assessing top-N recommenders. We propose methodologies for comparing the robustness and the discriminative power of different metrics. On the one hand, we study cut-offs and we find that deeper cut-offs offer greater robustness and discriminative power. On the other hand, we find that precision offers high robustness and Normalised Discounted Cumulative Gain provides the best discriminative power.

LiMe: Linear Methods for Pseudo-Relevance Feedback

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference 33rd Annual ACM Symposium on Applied Computing, SAC 2018, pp. 678-687, Pau, France, 9 - 13 April, 2018. | ISBN: 978-1-4503-5191-1

Abstract

Retrieval effectiveness has been traditionally pursued by improving the ranking models and by enriching the pieces of evidence about the information need beyond the original query. A successful method for producing improved rankings consists in expanding the original query. Pseudo-relevance feedback (PRF) has proved to be an effective method for this task in the absence of explicit user's judgements about the initial ranking. This family of techniques obtains expansion terms using the top retrieved documents yielded by the original query. PRF techniques usually exploit the relationship between terms and documents or terms and queries. In this paper, we explore the use of linear methods for pseudo-relevance feedback. We present a novel formulation of the PRF task as a matrix decomposition problem which we called LiMe. This factorisation involves the computation of an inter-term similarity matrix which is used for expanding the original query. We use linear least squares regression with regularisation to solve the proposed decomposition with non-negativity constraints. We compare LiMe on five datasets against strong state-of-the-art baselines for PRF showing that our novel proposal achieves improvements in terms of MAP, nDCG and robustness index.
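
A minimal sketch of the matrix-decomposition view follows, under simplifying assumptions: it learns a non-negative inter-term similarity column per term via plain non-negative least squares (omitting the regularisation of the full formulation), then expands the query with it. It mirrors the spirit of the approach, not the exact LiMe objective; the matrices are toy data.

```python
# Hedged sketch: learn non-negative inter-term weights that reconstruct each
# term column of the pseudo-relevant document-term matrix from the others,
# then use those weights to expand the original query.
import numpy as np
from scipy.optimize import nnls

X = np.array([[2., 0., 1., 0.],        # term frequencies of the
              [1., 1., 0., 1.],        # pseudo-relevant documents
              [0., 2., 1., 1.]])       # (rows: docs, columns: terms)
n_terms = X.shape[1]
W = np.zeros((n_terms, n_terms))
for j in range(n_terms):
    # Reconstruct column j from the other columns with non-negative weights
    # (the diagonal stays at zero, so a term cannot explain itself).
    others = np.delete(X, j, axis=1)
    w, _ = nnls(others, X[:, j])
    W[np.delete(np.arange(n_terms), j), j] = w

query = np.array([1., 0., 0., 0.])            # original query over the terms
expanded = query + 0.5 * (W.T @ query)        # interpolate with expansion mass
print("expansion weights:", np.round(expanded, 3))
```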

Finding and Analysing Good Neighbourhoods to Improve Collaborative Filtering

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Journal Knowledge-Based Systems, vol. 159, pp. 193-202, 2018 | ISSN: 0950-7051

Abstract

The research community has historically addressed the collaborative filtering task in several fashions. Although model-based approaches such as matrix factorisation attract substantial research efforts, neighbourhood-based recommender systems are effective and interpretable techniques. The performance of neighbour-based methods is strongly tied to the clustering strategies. In this paper, we show that there is room for improvement in this type of recommender. To show that, we build an oracle which yields approximately optimal neighbourhoods. We obtain ground truth neighbourhoods using the oracle and perform an analytical study of those to characterise them. As a result of our analysis, we propose to change the user profile size normalisation that cosine similarity employs in order to improve the neighbourhoods computed with the k-NN algorithm. Additionally, we present a more appropriate oracle for current grouping strategies which leads us to include the IDF effect in the cosine formulation. An extensive experimentation on four datasets shows an increase in ranking accuracy, diversity and novelty using these cosine variants. This work sheds light on the benefits of this type of analysis and paves the way for future research in the characterisation of good neighbourhoods for collaborative filtering.

A rank fusion approach based on score distributions for prioritizing relevance assessments in information retrieval evaluation

David E. Losada, Javier Parapar, Álvaro Barreiro
Journal Information Fusion, vol. 39, pp. 56-71, 2018 | ISSN: 1566-2535

Abstract

In this paper we study how to prioritize relevance assessments in the process of creating an Information Retrieval test collection. A test collection consists of a set of queries, a document collection, and a set of relevance assessments. For each query, only a sample of documents from the collection can be manually assessed for relevance. Multiple retrieval strategies are typically used to obtain such sample of documents. And rank fusion plays a fundamental role in creating the sample by combining multiple search results. We propose effective rank fusion models that are adapted to the characteristics of this evaluation task. Our models are based on the distribution of retrieval scores supplied by the search systems and our experiments show that this formal approach leads to natural and competitive solutions when compared to state of the art methods. We also demonstrate the benefits of including pseudo-relevance evidence into the estimation of the score distribution models.

A MapReduce implementation of posterior probability clustering and relevance models for recommendation

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Journal Engineering Applications of Artificial Intelligence, vol. 75, pp. 114-124, 2018 | ISSN: 0952-1976

Abstract

Relevance-Based Language Models are a formal probabilistic approach for explicitly introducing the concept of relevance in the Statistical Language Modelling framework. Recently, they have been shown to be a very effective way of computing recommendations. When combining this new recommendation approach with Posterior Probabilistic Clustering for computing neighbourhoods, the item ranking is further improved, radically surpassing rating prediction recommendation techniques. Nevertheless, in the current landscape, where the number of recommendation scenarios reaching the big data scale increases day after day, high figures of effectiveness are not enough. In this paper, we address one urgent and common need of recommender systems: algorithm scalability. In particular, we adapted those highly effective algorithms to the functional MapReduce paradigm, which has previously been proven an adequate tool for enabling recommender scalability. We evaluated the performance of our approach under realistic circumstances, showing good scalability behaviour with the number of nodes in the MapReduce cluster. Additionally, as a result of being able to execute our algorithms distributively, we report measures on a much bigger collection, supporting the results presented in the seminal paper.
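
The functional map/reduce shape of such an adaptation can be pictured with a framework-free toy job (a real deployment would run on Hadoop or Spark). The job below counts item co-occurrences across user profiles, a typical building block for neighbourhood computation; all details are illustrative assumptions, not the paper's code.

```python
# Hedged sketch: a toy map/reduce job counting item co-occurrences, with the
# shuffle phase emulated by grouping mapper output by key.
from collections import defaultdict
from itertools import combinations

profiles = {"u1": ["a", "b", "c"], "u2": ["a", "c"], "u3": ["b", "c"]}

def map_phase(user, items):
    # Emit one key/value pair per co-occurring item pair in a profile.
    for i, j in combinations(sorted(items), 2):
        yield (i, j), 1

def reduce_phase(key, values):
    return key, sum(values)

shuffled = defaultdict(list)
for user, items in profiles.items():
    for key, value in map_phase(user, items):
        shuffled[key].append(value)

print(dict(reduce_phase(k, v) for k, v in shuffled.items()))
```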

Overview of eRisk: Early Risk Prediction on the Internet

David E. Losada, Fabio Crestani, Javier Parapar
Conference Proceedings of the 9th International Conference of the CLEF Association, CLEF 2018, Avignon, France, September 10–14, 2018, Lecture Notes in Computer Science LNCS Vol. 11018, pp. 343-361 | ISBN: 978-3-319-98931-0

Abstract

This paper provides an overview of eRisk 2018. This was the second year that this lab was organized at CLEF. The main purpose of eRisk was to explore issues of evaluation methodology, effectiveness metrics and other processes related to early risk detection. Early detection technologies can be employed in different areas, particularly those related to health and safety. The second edition of eRisk had two tasks: a task on early risk detection of depression and a task on early risk detection of anorexia.

Overview of eRisk: Early Risk Prediction on the Internet (Extended Lab Overview).

David E. Losada, Fabio Crestani, Javier Parapar
Working Notes Working Notes of the 9th International Conference of the CLEF Association, CLEF 2018, Avignon, France, September 10–14, 2018

Abstract

This paper provides an overview of eRisk 2018. This was the second year that this lab was organized at CLEF. The main purpose of eRisk was to explore issues of evaluation methodology, effectiveness metrics and other processes related to early risk detection. Early detection technologies can be employed in different areas, particularly those related to health and safety. The second edition of eRisk had two tasks: a task on early risk detection of depression and a task on early risk detection of anorexia.

Query Expansion as a Matrix Factorization Problem: Extended Abstract.

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 5th Spanish Conference on Information Retrieval, CERI 2018, Zaragoza, Spain, 25-27 June, 2018 | ISBN: 978-1-4503-6543-7

Abstract

Pseudo-relevance feedback (PRF) provides an automatic method for query expansion in Information Retrieval. These techniques find relevant expansion terms using the top retrieved documents with the original query. In this paper, we present an approach based on linear methods called LiMe that formulates the PRF task as a matrix factorization problem. LiMe learns an inter-term similarity matrix from the pseudo-relevant set and the query, which it uses for computing expansion terms. The experiments on five datasets show that LiMe outperforms state-of-the-art baselines in most cases.

Cost-effective construction of Information Retrieval test collections.

David E. Losada, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 5th Spanish Conference on Information Retrieval, CERI 2018, Zaragoza, Spain, 25-27 June, 2018 | ISBN: 978-1-4503-6543-7

Abstract

In this paper we describe our recent research on effective construction of Information Retrieval collections. Relevance assessments are a core component of test collections, but they are expensive to produce. For each test query, only a sample of documents in the corpus can be assessed for relevance. We discuss here a class of document adjudication methods that iteratively choose documents based on reinforcement learning. Given a pool of candidate documents supplied by multiple retrieval systems, the production of relevance assessments is modeled as a multi-armed bandit problem. These bandit-based algorithms identify relevant documents with minimal effort. One instance of these models has been adopted by NIST to build the test collection of the TREC 2017 common core track.

Combining Top-N Recommenders with Metasearch Algorithms.

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 40th International ACM SIGIR conference on Research and Development in Information Retrieval, SIGIR 2017, Shinjuku, Tokyo, Japan, 7 - 11 August, pp. 805-808 | ISBN: 978-1-4503-5022-8

Abstract

Given the diversity of recommendation algorithms, choosing one technique is becoming increasingly difficult. In this paper, we explore methods for combining multiple recommendation approaches. We studied rank aggregation methods that have been proposed for the metasearch task (i.e., fusing the outputs of different search engines) but have never been applied to merge top-N recommender systems. These methods require no training data nor parameter tuning. We analysed two families of methods: voting-based and score-based approaches. These rank aggregation techniques yield significant improvements over state-of-the-art top-N recommenders. In particular, score-based methods yielded good results; however, some voting techniques were also competitive without using score information, which may be unavailable in some recommendation scenarios. The studied methods not only improve the state of the art of recommendation algorithms but they are also simple and efficient.
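
One classic voting-based aggregation in this spirit is the Borda count, sketched below as a hedged example; it is one plausible instance of the family the abstract mentions, not necessarily the exact set of methods studied.

```python
# Hedged sketch: Borda-count fusion of top-N lists from several recommenders.
# Each list contributes points inversely related to rank; items are then
# re-ranked by total points. Score-based variants would combine normalised
# scores instead of ranks.
from collections import Counter

def borda_fuse(rankings, depth=10):
    points = Counter()
    for ranking in rankings:
        for rank, item in enumerate(ranking[:depth]):
            points[item] += depth - rank
    return [item for item, _ in points.most_common()]

rec_a = ["i1", "i2", "i3", "i4"]
rec_b = ["i3", "i1", "i5", "i2"]
print(borda_fuse([rec_a, rec_b], depth=4))
```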

Multi-Armed Bandits for Adjudicating Documents in Pooling-Based Evaluation of Information Retrieval Systems

David E. Losada, Javier Parapar, Álvaro Barreiro
Journal Information Processing and Management, vol. 53, issue 3, pp. 1005-1025, 2017 | ISSN: 0306-4573

Abstract

Evaluating Information Retrieval systems is crucial to making progress in search technologies. Evaluation is often based on assembling reference collections consisting of documents, queries and relevance judgments done by humans. In large-scale environments, exhaustively judging relevance becomes infeasible. Instead, only a pool of documents is judged for relevance. By selectively choosing documents from the pool we can optimize the number of judgments required to identify a given number of relevant documents. We argue that this iterative selection process can be naturally modeled as a reinforcement learning problem and propose innovative and formal adjudication methods based on multi-armed bandits. Casting document judging as a multi-armed bandit problem is not only theoretically appealing, but also leads to highly effective adjudication methods. Under this bandit allocation framework, we consider stationary and non-stationary models and propose seven new document adjudication methods (five stationary methods and two non-stationary variants). Our paper also reports a series of experiments performed to thoroughly compare our new methods against current adjudication methods. This comparative study includes existing methods designed for pooling-based evaluation and existing methods designed for metasearch. Our experiments show that our theoretically grounded adjudication methods can substantially minimize the assessment effort.
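
A bare-bones UCB1 instantiation of this idea is sketched below, under assumed inputs: each "arm" is a participating system's ranked run, pulling an arm judges that system's next unjudged document, and the reward is 1 if the document turns out to be relevant. The paper's seven adjudication methods are considerably more refined than this sketch.

```python
# Hedged sketch: UCB1 bandit allocation over participating runs for pooled
# relevance judging.
import math

def ucb1_adjudicate(runs, is_relevant, budget):
    n = len(runs)
    pulls, wins, cursors = [0] * n, [0.0] * n, [0] * n
    judged = set()
    for t in range(1, budget + 1):
        # UCB1: play every arm once, then trade off mean reward vs uncertainty.
        scores = [wins[a] / pulls[a] + math.sqrt(2 * math.log(t) / pulls[a])
                  if pulls[a] else float("inf") for a in range(n)]
        a = max(range(n), key=scores.__getitem__)
        # Skip documents already judged through another system's run.
        while cursors[a] < len(runs[a]) and runs[a][cursors[a]] in judged:
            cursors[a] += 1
        pulls[a] += 1
        if cursors[a] >= len(runs[a]):
            continue                   # this run is exhausted; the pull is spent
        doc = runs[a][cursors[a]]
        cursors[a] += 1
        judged.add(doc)
        if is_relevant(doc):           # a human assessment in a real setting
            wins[a] += 1.0
    return judged

runs = [["d1", "d2", "d3"], ["d2", "d4", "d1"]]
print(ucb1_adjudicate(runs, lambda d: d in {"d2", "d4"}, budget=4))
```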

Axiomatic Analysis of Language Modelling of Recommender Systems

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Journal International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 25, issue 2, pp. 113-128, 2017 | ISSN: 1793-6411

Abstract

Language Models constitute an effective framework for text retrieval tasks. Recently, it has been extended to various collaborative filtering tasks. In particular, relevance-based language models can be used for generating highly accurate recommendations using a memory-based approach. On the other hand, the query likelihood model has proven to be a successful strategy for neighbourhood computation. Since relevance-based language models rely on user neighbourhoods for producing recommendations, we propose to use the query likelihood model for computing those neighbourhoods instead of cosine similarity. The combination of both techniques results in a formal probabilistic recommender system which has not been used before in collaborative filtering. A thorough evaluation on three datasets shows that the query likelihood model provides better results than cosine similarity. To understand this improvement, we devise two properties that a good neighbourhood algorithm should satisfy. Our axiomatic analysis shows that the query likelihood model always enforces those constraints while cosine similarity does not.
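
For intuition, the neighbourhood score can be written by adapting the standard query likelihood estimate with Dirichlet smoothing from text retrieval to user profiles; the notation below is ours (an assumption for illustration), not necessarily the paper's exact formulation.

```latex
% Query likelihood of user u under candidate neighbour v, with Dirichlet
% smoothing, treating u's profile as the "query" and v's as the "document":
\[
  \log p(u \mid v) \;=\; \sum_{i \in I_u} r_{u,i}\,
  \log \frac{r_{v,i} + \mu\, p(i \mid \mathcal{C})}
            {\sum_{j} r_{v,j} + \mu}
\]
% r_{u,i}: rating of user u for item i; I_u: items rated by u;
% p(i | C): item probability in the whole collection; mu: smoothing parameter.
```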

Psychological Features for Automatic Text Summarization

David E. Losada, Javier Parapar
Journal International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 25, issue 2, pp. 129-149, 2017 | ISSN: 1793-6411

Abstract

Automatically summarizing a document requires conveying the important points of a large document in only a few sentences. Extractive strategies for summarization are based on selecting the most important sentences from the input document(s). We claim here that standard features for estimating sentence importance can be effectively combined with innovative features that encode psychological aspects of communication. We employ Quantitative Text analysis tools for estimating psychological features and we inject them into state-of-the-art extractive summarizers. Our experiments demonstrate that this novel set of features is a good guidance for selecting salient sentences. Our empirical study concludes that psychological features are best suited for hard summarization cases. This motivated us to formally define and study the problem of predicting the difficulty of summarization. We propose a number of predictors to model the difficulty of every summarization problem and we evaluate several learning methods to perform this prediction task.

eRISK 2017: CLEF Lab on Early Risk Prediction on the Internet: Experimental Foundations

David E. Losada, Fabio Crestani, Javier Parapar
Conference Proceedings of the 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11–14, 2017, Lecture Notes in Computer Science LNCS Vol. 10456 pp. 346-360 | ISBN: 978-3-319-65812-4

Abstract

This paper provides an overview of eRisk 2017. This was the first year that this lab was organized at CLEF. The main purpose of eRisk was to explore issues of evaluation methodology, effectiveness metrics and other processes related to early risk detection. Early detection technologies can be employed in different areas, particularly those related to health and safety. The first edition of eRisk included a pilot task on early risk detection of depression.

CLEF 2017 eRisk Overview: Early Risk Prediction on the Internet: Experimental Foundations.

David E. Losada, Fabio Crestani, Javier Parapar
Working Notes Working Notes of the 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11–14, 2017

Abstract

This paper provides an overview of eRisk 2017. This was the first year that this lab was organized at CLEF. The main purpose of eRisk was to explore issues of evaluation methodology, effectiveness metrics and other processes related to early risk detection. Early detection technologies can be employed in different areas, particularly those related to health and safety. The first edition of eRisk had two possible ways to participate: a pilot task on early risk detection of depression, and a workshop open to the submission of papers related to the topics of the lab.

Item-based relevance modelling of recommendations for getting rid of long tail products

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Journal Knowledge-Based Systems, vol. 103, pp. 41-51, 2016 | ISSN: 0950-7051

Abstract

Recommender systems are a growing research field due to their immense potential application for helping users to select products and services. Recommenders are useful in a broad range of domains such as films, music, books, restaurants, hotels, social networks, news, etc. Traditionally, recommenders tend to promote certain products or services of a company that are already somewhat popular among the user community. An important research concern is how to formulate recommender systems centred on those items that are not very popular: the long tail products. A special case of those items are the ones that result from overstocking by the vendor. Overstock, that is, the excess of inventory, is a source of revenue loss. In this paper, we propose that recommender systems can be used to liquidate long tail products, maximising the business profit. First, we propose a formalisation for this task with the corresponding evaluation methodology and datasets. Then, we design a specially tailored algorithm centred on getting rid of those unpopular products, based on item relevance models. Comparison with existing proposals demonstrates that the advocated method is a significantly better algorithm for this task than other state-of-the-art techniques.

Efficient Pseudo-Relevance Feedback Methods for Collaborative Filtering Recommendation

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 38th European Conference on Information Retrieval, ECIR 2016, Padova, Italy, 20-23 March, 2016, Lecture Notes in Computer Science vol. 9626, pp 602-613 | ISBN: 978-3-319-30670-4

Abstract

Recently, Relevance-Based Language Models have been demonstrated as an effective Collaborative Filtering approach. Nevertheless, this family of Pseudo-Relevance Feedback techniques is computationally expensive for applying them to web-scale data. Also, they require the use of smoothing methods which need to be tuned. These facts lead us to study other similar techniques with better trade-offs between effectiveness and efficiency. Specifically, in this paper, we analyse the applicability to the recommendation task of four well-known query expansion techniques with multiple probability estimates. Moreover, we analyse the effect of neighbourhood length and devise a new probability estimate that takes into account this property yielding better recommendation rankings. Finally, we find that the proposed algorithms are dramatically faster than those based on Relevance-Based Language Models, they do not have any parameter to tune (apart from the ones of the neighbourhood) and they provide a better trade-off between accuracy and diversity/novelty.

Language Models for Collaborative Filtering Neighbourhoods

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 38th European Conference on Information Retrieval, ECIR 2016, Padova, Italy, 20-23 March, 2016, Lecture Notes in Computer Science vol. 9626, pp 614-625 | ISBN: 978-3-319-30670-4

Abstract

Language Models are state-of-the-art methods in Information Retrieval. Their sound statistical foundation and high effectiveness in several retrieval tasks are key to their current success. In this paper, we explore how to apply these models to deal with the task of computing user or item neighbourhoods in a collaborative filtering scenario. Our experiments showed that this approach is superior to other neighbourhood strategies and also very efficient. Our proposal, in conjunction with a simple neighbourhood-based recommender, showed a great performance compared to state-of-the-art methods (NNCosNgbr and PureSVD) while its computational complexity is low.

Feeling Lucky? Multi-armed Bandits for Ordering Judgements in Pooling-based Evaluation

David E. Losada, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 31st ACM Symposium on Applied Computing, SAC 2016, Pisa, Italy, 4-8 April, 2016, pp 1027-1034 | ISBN: 978-1-4503-3739-7

Abstract

Evaluation is crucial in Information Retrieval. The Cranfield paradigm allows reproducible system evaluation by fostering the construction of standard and reusable benchmarks. Each benchmark or test collection comprises a set of queries, a collection of documents and a set of relevance judgements. Relevance judgements are often done by humans and thus expensive to obtain. Consequently, relevance judgements are customarily incomplete. Only a subset of the collection, the pool, is judged for relevance. In TREC-like campaigns, the pool is formed by the top retrieved documents supplied by systems participating in a certain evaluation task. With multiple retrieval systems contributing to the pool, an exploration/exploitation trade-off arises naturally. Exploiting effective systems could find more relevant documents, but exploring weaker systems might also be valuable for the overall judgement process. In this paper, we cast document judging as a multi-armed bandit problem. This formal modelling leads to theoretically grounded adjudication strategies that improve over the state of the art. We show that simple instantiations of multi-armed bandit models are superior to all previous adjudication strategies.

Additive Smoothing for Relevance-Based Language Modelling of Recommender Systems

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 4th Spanish Conference on Information Retrieval, CERI 2016, Article 9, Granada, Spain, 14 - 16 June, 2016 ISBN: 978-1-4503-4141-7

Abstract

The use of Relevance-Based Language Models for top-N recommendation has become a promising line of research. Previous works have used collection-based smoothing methods for this task. However, a recent analysis of RM1 (an estimation of Relevance-Based Language Models) in document retrieval showed that this type of smoothing method demotes the IDF effect in pseudo-relevance feedback. In this paper, we claim that the IDF effect from retrieval is closely related to the concept of novelty in recommendation. We perform an axiomatic analysis of the IDF effect on RM2, concluding that this kind of smoothing method also demotes the IDF effect in recommendation. Through axiomatic analysis, we find that a collection-agnostic method, Additive smoothing, does not demote this property. Our experiments confirm that this alternative improves the accuracy, novelty and diversity figures of the recommendations.

Injecting Multiple Psychological Features into Standard Text Summarisers

David E. Losada, Javier Parapar
Conference Proceedings of the 4th Spanish Conference on Information Retrieval, CERI 2016, Article 1, Granada, Spain, 14 - 16 June, 2016 ISBN: 978-1-4503-4141-7

Abstract

Automatic Text Summarisation is an essential technology to cope with the overwhelming amount of documents generated daily. Given an information source, such as a webpage or a news article, text summarisation consists of extracting content from it and presenting it in a condensed form for human consumption. Summaries are crucial to facilitate information access. The reader is provided with the key information in a concise and fluent way. This speeds up navigation through large repositories of data. With the rapid growth of online contents, creating manual summaries is not an option. Extractive summarisation methods are based on selecting the most important sentences from the input. To meet this aim, a ranking of candidate sentences is often built from a reduced set of sentence features. In this paper, we show that many features derived from psychological studies are valuable for constructing extractive summaries. These features encode psychological aspects of communication and are a good guidance for selecting salient sentences. We use Quantitative Text Analysis tools for extracting these features and inject them into state-of-the-art extractive summarisers. Incorporating these novel components into existing extractive summarisers requires combining and weighting a high number of sentence features. In this respect, we show that Particle Swarm Optimisation is a viable approach to set the features' weights. Following standard evaluation practice (DUC benchmarks), we also demonstrate that our novel summarisers are highly competitive.

Computing Neighbourhoods with Language Models in a Collaborative Filtering Scenario

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Workshop Proceedings of the 7th Italian Information Retrieval Workshop, IIR 2016, Venice, Italy, 30-31 May, 2016

Abstract

Language models represent a successful framework for many Information Retrieval tasks: ad hoc retrieval, pseudo-relevance feedback or expert finding are some examples. We present how language models can effectively compute user or item neighbourhoods in a collaborative filtering scenario (this idea was originally proposed at ECIR 2016). The experiments support the applicability of this approach for neighbourhood-based recommendation, surpassing the rest of the baselines. Additionally, the computational cost of this approach is small, since language models have been efficiently applied to large-scale retrieval tasks such as web search.

A Study of Priors for Relevance-Based Language Modelling of Recommender Systems

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 9th ACM Conference on Recommender Systems, RecSys 2015, Vienna, Austria, 16-20 September, 2015, pp 237-240 | ISBN: 978-1-4503-3692-5

Abstract

Probabilistic modelling of recommender systems naturally introduces the concept of prior probability into the recommendation task. Relevance-Based Language Models, a principled probabilistic query expansion technique in Information Retrieval, have recently been adapted to the item recommendation task with success. In this paper, we study the effect of the item and user prior probabilities under that framework. We adapt two priors from the document retrieval field and then propose two new probabilistic priors. Evidence gathered from experimentation indicates that a linear prior for the neighbour and a probabilistic prior based on Dirichlet smoothing for the items improve the quality of the item recommendation ranking.

A Study of Smoothing Methods for Relevance-Based Language Modelling of Recommender Systems

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 37th European Conference on Information Retrieval, ECIR 2015, Vienna, Austria, 29 March - 2 April, 2015, Lecture Notes in Computer Science vol. 9022, pp 346-351 | ISBN: 978-3-319-16353-6

Abstract

Language Models have been traditionally used in several fields like speech recognition or document retrieval. Only recently was their use extended to collaborative Recommender Systems. In this field, a Language Model is estimated for each user based on the probabilities of the items. A central issue in the estimation of such a Language Model is smoothing, i.e., how to adjust the maximum likelihood estimator to compensate for rating sparsity. This work is devoted to exploring how the classical smoothing approaches (Absolute Discounting, Jelinek-Mercer and Dirichlet priors) perform in the field of Recommender Systems. We tested the different methods under the recently presented Relevance-Based Language Models for collaborative filtering, and compared how the smoothing techniques behave in terms of precision and stability. We found that Absolute Discounting is practically insensitive to the parameter value, making it an almost parameter-free method, while, at the same time, its performance is similar to Jelinek-Mercer and Dirichlet priors.
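
For reference, the three classical smoothed estimates compared here can be written as below, in the standard forms known from the text-retrieval literature, adapted to a user profile u and item i; the notation (c(i,u) for the rating count, p(i|C) for the collection estimate) is an assumption for illustration.

```latex
\begin{align*}
  \text{Jelinek-Mercer:} \quad &
    p_\lambda(i \mid u) = (1-\lambda)\,\frac{c(i,u)}{\sum_j c(j,u)}
      + \lambda\, p(i \mid \mathcal{C})\\[2pt]
  \text{Dirichlet priors:} \quad &
    p_\mu(i \mid u) = \frac{c(i,u) + \mu\, p(i \mid \mathcal{C})}
                           {\sum_j c(j,u) + \mu}\\[2pt]
  \text{Absolute Discounting:} \quad &
    p_\delta(i \mid u) = \frac{\max\bigl(c(i,u)-\delta,\,0\bigr)
      + \delta\,|I_u|\, p(i \mid \mathcal{C})}{\sum_j c(j,u)}
\end{align*}
```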

Finding a Needle in the Blogosphere: An Information Fusion Approach for Blog Distillation Search

Jose M. Chenlo, Javier Parapar, David E. Losada, Jose Santos
Journal Information Fusion, vol. 23, pp. 58-68, 2015 | ISSN: 1566-2535

Abstract

In the blogosphere, different actors express their opinions about multiple topics. Users, companies and editors interact socially by commenting on, recommending and linking blogs and posts. This social media content is growing rapidly; as a matter of fact, the size of the blogosphere is estimated to double every six months. In this context, the problem of finding a topically relevant blog to subscribe to becomes a Big Data challenge. Moreover, combining multiple types of evidence is essential for this search task. In this paper we propose a group of textual and social-based signals, and apply different Information Fusion algorithms to a Blog Distillation Search task. Information fusion through the combination of the different types of evidence requires optimisation to appropriately weight each source of evidence. To this end, we analyse well-established population-based search methods: global search methods (Particle Swarm Optimisation and Differential Evolution) and a local search method (Line Search) that has been effective in various Information Retrieval tasks. Moreover, we propose hybrid combinations of the global and local search methods, and compare all the alternatives following a standard methodology. Efficiency is an imperative here and, therefore, we focus not only on achieving high search effectiveness but also on designing efficient solutions.

Score Distributions for Pseudo Relevance Feedback

Javier Parapar, Manuel A. Presedo-Quindimil, Álvaro Barreiro
Journal Information Sciences, vol. 273, pp. 171-181, 2014 | ISSN: 0020-0255

Abstract

Relevance-Based Language Models, commonly known as Relevance Models, are successful approaches to explicitly introduce the concept of relevance in the statistical Language Modelling framework of Information Retrieval. These models achieve state-of-the-art retrieval performance in the Pseudo Relevance Feedback task. It is known that one of the factors that most affects Pseudo Relevance Feedback robustness is the selection, for some queries, of harmful expansion terms. In order to minimise this effect, a crucial point in these methods is to reduce the number of non-relevant documents in the pseudo-relevant set. In this paper, we propose an original approach to tackle this problem: we try to automatically determine, for each query, how many documents should be selected as the pseudo-relevant set. To achieve this objective, we study the score distributions of the initial retrieval, trying to discern between relevant and non-relevant documents on the basis of their distributions. Evaluation of our proposal showed important improvements in terms of robustness.
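
A rough sketch of a per-query cutoff derived from score distributions; modelling the scores as a two-component Gaussian mixture is an assumption made here for illustration, not necessarily the distributional model used in the paper.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def pseudo_relevant_cutoff(scores, max_k=50):
        # Fit a two-component mixture to the initial retrieval scores and keep
        # as pseudo-relevant only the top documents whose posterior of
        # belonging to the high-score ("relevant") component stays above 0.5.
        s = np.sort(np.asarray(scores, dtype=float))[::-1].reshape(-1, 1)
        gmm = GaussianMixture(n_components=2, random_state=0).fit(s)
        rel = int(np.argmax(gmm.means_))
        posterior = gmm.predict_proba(s)[:, rel]
        return max(int(np.sum(posterior[:max_k] > 0.5)), 1)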

Combining Psycho-linguistic, Content-based and Chat-based Features to Detect Predation in Chatrooms

Javier Parapar, David E. Losada, Álvaro Barreiro
Journal Journal of Universal Computer Science, vol. 20, issue 2, pp. 213-239, 2014 | ISSN: 0948-695X

Abstract

The Digital Age has brought great benefits for the human race, but also some drawbacks. Nowadays, people from opposite corners of the world can communicate online via instant messaging services. Unfortunately, this has introduced new kinds of crime. Sexual predators have adapted their predatory strategies to these platforms and, usually, the target victims are kids. The authorities cannot manually track all threats because massive numbers of online conversations take place on a daily basis. Automatic methods for alerting about these crimes need to be designed. This is the main motivation of this paper, where we present a Machine Learning approach to identify suspicious subjects in chat rooms. We propose novel types of features for representing the chatters and we evaluate different classifiers against the largest benchmark available. This empirical validation shows that our approach is promising for the identification of predatory behaviour. Furthermore, we carefully analyse the characteristics of the learnt classifiers. This preliminary analysis is a first step towards profiling the behaviour of sexual predators when chatting on the Internet.

When Recommenders Met Big Data: An Architectural Proposal and Evaluation

Daniel Valcarce, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 3rd Spanish Conference on Information Retrieval, CERI'14, A Coruña, Spain, June 19-20, pp. 73-84, 2014 | ISBN: 978-84-9749-591-2

Abstract

Nowadays, scalability is a critical factor in the design of any system working with big data. In particular, it has been recognised as a main challenge in the construction of recommender systems. In this paper, we present a recommender architecture capable of making personalised recommendations using collaborative filtering in a big data environment. We aim to build highly scalable systems without any single point of failure. Replication and data distribution, as well as caching techniques, are used to achieve this goal. We suggest specific technologies for each subsystem of our proposed architecture, considering scalability and fault tolerance. Furthermore, we evaluate, under realistic scenarios, the performance of different alternatives (RDBMS and NoSQL) for storing, generating and serving recommendations.

Relevance-Based Language Modelling for Recommender Systems

Javier Parapar, Alejandro Bellogín, Pablo Castells, Álvaro Barreiro
Journal Information Processing and Management, vol. 49, issue 4, pp. 966-980, 2013 | ISSN: 0306-4573

Abstract

Relevance-Based Language Models, commonly known as Relevance Models, are successful approaches to explicitly introduce the concept of relevance in the statistical Language Modelling framework of Information Retrieval. These models achieve state-of-the-art retrieval performance in the pseudo relevance feedback task. On the other hand, the field of recommender systems is a fertile research area where users are provided with personalised recommendations in several applications. In this paper, we propose an adaptation of the Relevance Modelling framework to effectively suggest recommendations to a user. We also propose a probabilistic clustering technique to perform the neighbour selection process, as a way to achieve a better approximation of the set of relevant items in the pseudo relevance feedback process. These techniques, although well known in the Information Retrieval field, had not yet been applied to recommender systems, and, as the empirical evaluation results show, both proposals individually outperform several baseline methods. Furthermore, by combining both approaches, even larger effectiveness improvements are achieved.
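
A compact sketch of an RM1-style estimation adapted to recommendation, scoring unseen items by spreading the likelihood of the user's profile over the neighbours' item models; the `smoothed_p` callable and the uniform neighbour prior are assumptions of this example (in practice the products would be computed in log space to avoid underflow).

    from collections import defaultdict

    def rm1_recommend(user_items, neighbours, neighbour_items, smoothed_p, n=10):
        # p(i | R_u) is estimated as sum_v p(v) * p(i|v) * prod_{j in I_u} p(j|v),
        # where smoothed_p(item, v) is a smoothed item probability under v.
        scores = defaultdict(float)
        p_v = 1.0 / len(neighbours)
        for v in neighbours:
            likelihood = p_v
            for j in user_items:
                likelihood *= smoothed_p(j, v)
            for i in neighbour_items[v]:
                if i not in user_items:
                    scores[i] += likelihood * smoothed_p(i, v)
        return sorted(scores, key=scores.get, reverse=True)[:n]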

Probabilistic Collaborative Filtering with Negative Cross Entropy

Alejandro Bellogín, Javier Parapar, Pablo Castells
Conference Proceedings of the 7th ACM Conference on Recommender Systems, pp. 387-390, ACM RecSys 2013, Hong Kong, October 2013 | ISBN: 978-1-4503-2409-0

Abstract

Relevance-Based Language Models are an effective IR approach which explicitly introduces the concept of relevance in the statistical Language Modelling framework of Information Retrieval. These models have been shown to achieve state-of-the-art retrieval performance in the pseudo relevance feedback task. In this paper we propose a novel adaptation of this language modelling approach to rating-based Collaborative Filtering. In a memory-based approach, we apply the model to the formation of user neighbourhoods and the generation of recommendations based on such neighbourhoods. We report experimental results where our method outperforms other standard memory-based algorithms in terms of ranking precision.

Comments-Oriented Query Expansion for Opinion Retrieval in Blogs

José M. G. Chenlo, Javier Parapar, David E. Losada
Conference Proceedings of the 15th Conference of the Spanish Association for Artificial Intelligence, CAEPIA 2013, Madrid, Spain, September 2013, Lecture Notes in Computer Science vol. 8109, pp. 32-41 | ISBN: 978-3-642-40642-3

Abstract

In recent years, Pseudo Relevance Feedback techniques have become one of the most effective query expansion approaches for document retrieval. In particular, Relevance-Based Language Models have been applied in several domains as an effective and efficient way to enhance topic retrieval. Recently, some extensions to the original Relevance Modelling methods have been proposed to apply query expansion in other scenarios, such as opinion retrieval. Such approaches rely on mixture models that combine the query expansion provided by Relevance Models with opinionated terms obtained from external resources (e.g., opinion lexicons). However, these methods ignore the structural aspects of a document, which are valuable for extracting topic-dependent opinion expressions. For instance, the sentiments conveyed in blogs are often located in specific parts of the blog posts and their comments. We argue here that the comments are a good guide for finding on-topic opinion terms that help to move the query towards the most salient aspects of the topic. We study the role of the different parts of a blog document in enhancing blog opinion retrieval through query expansion. The proposed method does not require external resources or additional knowledge, and our experiments show that this is a promising and simple way to produce a more accurate ranking of blog posts in terms of their sentiment towards the query topic. Our approach compares well with other opinion finding methods, obtaining high precision performance without harming mean average precision.

Language Modelling of Constraints for Text Clustering

Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 34th European Conference on Information Retrieval Research, ECIR 2012, Barcelona, Spain, April 2012, Lecture Notes in Computer Science vol. 7224, pp. 352-363 | ISBN: 978-3-642-28996-5

Abstract

Constrained clustering is a recently presented family of semi-supervised learning algorithms. These methods use domain information to impose constraints on the clustering output. The way in which those constraints (typically pair-wise constraints between documents) are introduced is by designing new clustering algorithms that enforce the satisfaction of the constraints. In this paper we present an alternative approach to constrained clustering where, instead of defining new algorithms or objective functions, the constraints are introduced by modifying the document representation by means of their language modelling. More precisely, the constraints are modelled using the well-known Relevance Models, successfully applied in other retrieval tasks such as pseudo-relevance feedback. To the best of our knowledge, this is the first attempt at such an approach. The results show that the presented approach is an effective method for constrained clustering, even improving the results of existing constrained clustering algorithms.

Using Graph Partitioning Techniques for Neighbour Selection in User-Based Collaborative Filtering

Alejandro Bellogín, Javier Parapar
Conference Proceedings of the 6th ACM Conference on Recommender Systems, pp. 213-216, ACM RecSys 2012, Dublin, Ireland, September 2012 | ISBN: 978-1-4503-1270-7

Abstract

Spectral clustering techniques have become one of the most popular clustering algorithms, mainly because of their simplicity and effectiveness. In this work, we make use of one of these techniques, Normalised Cut, in order to derive a cluster-based collaborative filtering algorithm which outperforms other standard techniques in the state-of-the-art in terms of ranking precision. We frame this technique as a method for neighbour selection, and we show its effectiveness when compared with other cluster-based methods. Furthermore, the performance of our method could be improved if standard similarity metrics -- such as Pearson's correlation -- are also used when predicting the user's preferences.
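
A minimal sketch of the neighbour-selection step, using scikit-learn's spectral clustering (a Normalised-Cut-style partition) over a precomputed user-user similarity matrix; the similarity choice and the number of clusters are assumptions of this example.

    import numpy as np
    from sklearn.cluster import SpectralClustering

    def cluster_neighbours(sim, user_idx, n_clusters=50):
        # Partition users by spectral clustering on a non-negative user-user
        # similarity matrix and take the target user's cluster as its
        # neighbourhood for collaborative filtering.
        labels = SpectralClustering(n_clusters=n_clusters,
                                    affinity="precomputed",
                                    random_state=0).fit_predict(sim)
        return np.where(labels == labels[user_idx])[0]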

A learning-based approach for the identification of sexual predators in chat logs

Javier Parapar, David E. Losada, Álvaro Barreiro
Working Notes Proceedings of the CLEF 2012 Evaluation Labs and Workshop Online Working Notes, PAN 2012, Rome, Italy, September 2012 | ISBN: 978-88-904810-3-1

Abstract

The existence of sexual predators who enter chat rooms or forums and try to convince children to provide some sexual favour is a socially worrying issue. Manually monitoring these interactions is one way to attack this problem. However, this manual approach simply cannot keep pace because of the high number of conversations and the huge number of chatrooms or forums where these conversations take place daily. We need tools that automatically process massive amounts of conversations and alert about possible offences. The sexual predator identification challenge within PAN 2012 is a valuable way to promote research in this area. Our team faced this task as a Machine Learning problem and we designed several innovative sets of features that guide the construction of classifiers for identifying sexual predation. Our methods are driven by psycholinguistic, chat-based, and tf/idf features and yield very effective classifiers.
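
The abstract does not fix a concrete classifier, so the following is only a minimal tf/idf baseline of the kind such systems start from; the psycholinguistic and chat-based features the paper adds would be concatenated as extra feature columns. The variable names are hypothetical.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # One "document" per chatter: the concatenation of everything they wrote.
    classifier = make_pipeline(TfidfVectorizer(sublinear_tf=True, min_df=2),
                               LinearSVC())
    # classifier.fit(chatter_texts, is_predator)  # labels from, e.g., PAN 2012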

An Experimental Study of Constrained Clustering Effectiveness in Presence of Erroneous Constraints

M. Eduardo Ares, Javier Parapar, Álvaro Barreiro
Journal Information Processing and Management, vol. 48, issue 3, pp. 537-551, 2012 | ISSN: 0306-4573

Abstract

Recently, a new family of semi-supervised clustering algorithms, coined constrained clustering, has emerged. These new algorithms can incorporate a priori domain knowledge into the clustering process, allowing the user to guide the method. The vast majority of studies about the effectiveness of these approaches have been performed using information, in the form of constraints, which was totally accurate. This would be the ideal case, but such a situation is impossible in most realistic settings, due to errors in the constraint creation process, misjudgements by the user, inconsistent information, etc. Hence, the robustness of constrained clustering algorithms when dealing with erroneous constraints is bound to play an important role in their final effectiveness. In this paper we study the behaviour of four constrained clustering algorithms (Constrained k-Means, Soft Constrained k-Means, Constrained Normalised Cut and Normalised Cut with Imposed Constraints) when not all the information supplied to them is accurate. The experimentation over text and numeric datasets, using two different noise models, one of them an original approach based on similarities, highlighted the strengths and weaknesses of each method when working with positive and negative constraints, indicating the scenarios in which each algorithm is more appropriate.

Improving the Extraction of Text in PDFs by Simulating the Human Reading Order

Ismael Hasan, Javier Parapar, Álvaro Barreiro
Journal Journal of Universal Computer Science, vol. 18, issue 5, pp. 623-649, 2012 | ISSN: 0948-695X

Abstract

Text preprocessing and segmentation are critical tasks in search and text mining applications. Due to the huge number of documents that are available exclusively in PDF format, most Data Mining (DM) and Information Retrieval (IR) systems must extract content from PDF files. On some occasions this is a difficult task: the result of the extraction process from a PDF file is plain text, and it should be returned in the same order as a human would read the original PDF file. However, current tools for PDF text extraction fail in this objective when working with complex documents with multiple columns. For instance, this is the case for official government bulletins with legal information. In this task, it is mandatory to obtain correct and ordered text as the result of applying the PDF extractor. It is very common for a legal article in a document to refer to a previous article, and they should be offered in the right sequential order. To overcome these difficulties we have designed a new method for text extraction from PDFs that simulates the human reading order. We evaluated our method and compared it against other PDF extraction tools and algorithms. Evaluation of our approach shows that it significantly outperforms the results of the existing tools and algorithms.

Finding the Best Parameter Setting: Particle Swarm Optimisation

Javier Parapar, María M. Vidal, José Santos
Conference Proceedings of the 2nd Spanish Conference on Information Retrieval, CERI'12, Valencia, Spain, June 18-19, pp. 49-60, 2012 | ISBN: 978-84-8021-860-3

Abstract

Information Retrieval techniques traditionally depend on the setting of one or more parameters. Depending on the problem and the techniques, the number of parameters can be one, two or even dozens. One crucial problem in Information Retrieval research is achieving a good parameter setting for its methods. The tuning process, when dealing with several parameters, is a time-consuming and critical step. In this paper we introduce the use of Particle Swarm Optimisation for the automatic tuning of the parameters of Information Retrieval methods. We compare our proposal with the Line Search method, previously adopted in Information Retrieval. The comparison shows that our approach is faster and achieves better results than Line Search. Furthermore, Particle Swarm Optimisation algorithms are suitable for parallelisation, improving the behaviour of the algorithm in terms of convergence time.
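
For concreteness, here is a minimal PSO loop for maximising an arbitrary objective (e.g. MAP on a validation set as a function of the parameter vector); the inertia and acceleration constants are textbook defaults, not the values used in the paper.

    import random

    def pso(objective, dim, bounds, n_particles=20, iters=100,
            w=0.7, c1=1.5, c2=1.5):
        # Standard Particle Swarm Optimisation: each particle moves under the
        # pull of its personal best and the swarm's global best position.
        lo, hi = bounds
        pos = [[random.uniform(lo, hi) for _ in range(dim)]
               for _ in range(n_particles)]
        vel = [[0.0] * dim for _ in range(n_particles)]
        pbest = [p[:] for p in pos]
        pbest_val = [objective(p) for p in pos]
        best = max(range(n_particles), key=lambda i: pbest_val[i])
        gbest, gbest_val = pbest[best][:], pbest_val[best]
        for _ in range(iters):
            for i in range(n_particles):
                for d in range(dim):
                    r1, r2 = random.random(), random.random()
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (pbest[i][d] - pos[i][d])
                                 + c2 * r2 * (gbest[d] - pos[i][d]))
                    pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
                val = objective(pos[i])
                if val > pbest_val[i]:
                    pbest[i], pbest_val[i] = pos[i][:], val
                    if val > gbest_val:
                        gbest, gbest_val = pos[i][:], val
        return gbest, gbest_val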

Análisis de herramientas para la docencia práctica en Recuperación de Información

Javier Parapar, Álvaro Barreiro
Book Chapter FECIES 2012 | ISBN: 978-84-695-6734-0

Chapter in the proceedings book of the 9th International Forum on the Quality Assessment of Higher Education and Research (FECIES 2012)

Loreto Del Río Bermúdez, Inmaculada Teva Álvarez (eds.)
FECIES

The adaptation of the Computer Science Engineering degrees to the European Higher Education Area (EHEA) has entailed, on the one hand, a renewal of the courses on offer and, on the other, a change in the established teaching paradigm. In particular, the Faculty of Computer Science of the University of A Coruña has introduced the course Information Retrieval into its curricula. Information Retrieval is, nowadays, a mature and established subject within computer science. The University of A Coruña, which since its foundation has been a reference for computer science in the autonomous community of Galicia, has included it as a fundamental subject in its new study plans. Specifically, in the B.Sc. Eng. in Computer Science at the University of A Coruña, the Information Retrieval course belongs to the Computing itinerary and carries 6 credits. In the recently proposed M.Sc. Eng. in Computer Science plan, the course Information Retrieval and Semantic Web also carries 6 ECTS credits. A large part of the teaching associated with these new subjects will be practical in nature, as these are Engineering degrees. In this scenario, there is a fundamental need for suitable tools adapted to the new educational paradigm where, in keeping with the spirit of the EHEA, students' autonomous work increases and the contact hours guided by a lecturer decrease. It is therefore our intention, in light of the new teaching and methodological situation, to review the existing tools for the practical teaching of Information Retrieval, with special emphasis on the factors introduced by the constraints associated with the adaptation to the EHEA. Specifically, in this work we analyse software tools considering several factors that are important for teaching, without aiming to be exhaustive: programming language, licence, community, documentation, support, available models, ease of evaluation, etc. Although some comparisons of software tools exist from the point of view of commercial or research use, in this document we consider it important to analyse the tools from the point of view of their suitability for teaching and learning. This work falls within the line of teaching methodology and resources in the framework of the adaptation to the EHEA, and we answer some important questions such as: which tools are most suitable for students' autonomous work? Which tools are most suitable given the background students acquire in the context of the study plans of the University of A Coruña? Which tools will allow the lecturer to put into practice the syllabus explained in lectures? And which tools will facilitate the continuous assessment of students?

A Cluster Based Pseudo Feedback Technique which Exploits Good and Bad Clusters

Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 14th Conference of the Spanish Association for Artificial Intelligence, CAEPIA 2011, 7-11 November 2011, Tenerife, Spain, Lecture Notes in Artificial Intelligence vol. 7023, pp. 403-412 | ISBN: 978-3-642-25273-0

Abstract

In recent years, cluster-based retrieval has been shown to be an effective tool both for interactive retrieval and for pseudo relevance feedback techniques. In this paper we propose a new cluster-based retrieval function which uses the best and worst clusters of a document in the cluster ranking to improve retrieval effectiveness. The evaluation shows improvements over state-of-the-art techniques in precision and robustness on some standard TREC collections.

Promoting Divergent Terms in the Estimation of Relevance Models

Javier Parapar, Álvaro Barreiro
Conference Proceedings of the Third International Conference on the Theory of Information Retrieval, ICTIR 2011, 12-14 September 2011, Bertinoro, Italy, Lecture Notes in Computer Science vol. 6931, pp. 77-88 | ISBN: 978-3-642-23317-3

Abstract

Traditionally, the use of pseudo relevance feedback (PRF) techniques for query expansion has been shown to be very effective. In particular, the use of Relevance Models (RM) in the context of the Language Modelling framework has been established as a high-performance approach to beat. In this paper we present an alternative estimation for the RM, promoting terms that, while present in the relevance set, are also distant from the language model of the collection. We compared this approach with RM3 and with an adaptation to the Language Modelling framework of Rocchio's KLD-based term ranking function. The evaluation showed that this alternative estimation of the RM consistently reports better results than RM3, and on average it is the most stable across collections in terms of robustness.
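
The KLD-based ranking mentioned above scores a term by its contribution to the divergence between the relevance-set model and the collection model; a one-line version, assuming both probabilities are positive:

    import math

    def kld_term_score(p_w_rel, p_w_coll):
        # Rewards terms frequent in the (pseudo-)relevant set but rare in the
        # collection; used to rank candidate expansion terms.
        return p_w_rel * math.log(p_w_rel / p_w_coll)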

Improving text clustering with social tagging

M. Eduardo Ares, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, ICWSM 2011, 17-21 July 2011, Barcelona, pp. 430-433 | ISBN: 978-1-57735-505-2

Abstract

In this paper we study the use of social bookmarking to improve the quality of text clustering. Recently, constrained clustering algorithms have been presented as a successful tool for introducing domain knowledge into the clustering process. This paper uses the tags saved by the users of Delicious to generate non-artificial constraints for constrained clustering algorithms. The study demonstrates that it is possible to achieve a high percentage of good constraints with this simple approach and, more importantly, the evaluation shows that the use of these constraints produces a great improvement (up to 91.25%) in the effectiveness of the clustering algorithms.

The Use of Latent Semantic Indexing to Mitigate OCR Effects of Related Document Images

Renato de Freitas Bulcão-Neto, José Antonio Camacho-Guerrero, Márcio Dutra, Álvaro Barreiro, Javier Parapar, Alessandra Alaniz Macedo
Journal Journal of Universal Computer Science, vol. 17, issue 1, pp. 64-80, 2011 | ISSN: 0948-695X


Agrupamiento documental

M. Eduardo Ares, Javier Parapar, Álvaro Barreiro
Book Chapter RA-MA, September 2011 | ISBN: 978-84-9964-112-6

Recuperación de Información. Un enfoque práctico y multidisciplinar

F. Cacheda Seijo, J.M. Fernández-Luna and J. Huete (eds.)
Recuperación de Información

This book arises from the need for material that, with an eminently didactic approach, offers a general overview of the discipline of Information Retrieval, ranging from its foundations to current research proposals. The idea is to show the reader the inner workings of an area of knowledge whose advances are directly transferred to the programs we use every day for all kinds of everyday tasks. To achieve these goals, the book brings together a panel of experts internationally recognised for their research in the field of Information Retrieval. Each of them has focused on the chapters whose topics they specialise in and know extensively. Moreover, the great majority of them have invaluable teaching experience in Information Retrieval courses, so their experience and knowledge in disseminating this discipline have been carried over directly into their chapters, and implicitly into the book as a whole.

University of Lugano at TREC 2010

Mostafa Keikha, Parvaz Mahdabi, Shima Gerani, Giacomo Inches, Javier Parapar, Mark Carman, Fabio Crestani.
Working Notes Proceedings of the Nineteenth Text Retrieval Conference (TREC 2010), Gaithersburg, Maryland, November 16-19, 2010.

Abstract

We report on the University of Lugano's participation in the Blog and Session tracks of TREC 2010. In particular, we describe our system for performing blog distillation, faceted search, top stories identification and session reranking.

Blog Snippets: A Comments-Biased Approach

Javier Parapar, Jorge López-Castro, Álvaro Barreiro
Conference Proceedings of the 33rd ACM International Conference on Research and Development in Information Retrieval SIGIR'10, Geneva, Switzerland, July 19-23, pp. 711-712 | ISBN: 978-1-60558-896-4

Abstract

In recent years, Blog Search has become an exciting new task in Information Retrieval. The presence of user-generated information with valuable opinions makes this field of huge interest. In this poster we use part of this information, the readers' comments, to improve the quality of post snippets, with the objective of enhancing the user's access to the relevant posts in a result list. We propose a simple method for snippet generation based on sentence selection, using the comments to guide the selection process. We evaluated our approach with standard TREC methodology on the Blogs06 collection, showing significant improvements of up to 32% in terms of MAP over the baseline.

Where to Start Filtering Redundancy? A Cluster-Based Approach

Ronald T. Fernández, Javier Parapar, David E. Losada, Álvaro Barreiro
Conference Proceedings of the 33rd ACM International Conference on Research and Development in Information Retrieval SIGIR'10, Geneva, Switzerland, July 19-23, pp. 735-736 | ISBN: 978-1-60558-896-4

Abstract

Novelty detection is a difficult task, particularly at sentence level. Most of the approaches proposed in the past consist of re-ordering all sentences following their novelty scores. However, this re-ordering usually has little value. In fact, a naive baseline with no novelty detection capabilities often yields better performance than any state-of-the-art novelty detection mechanism. We argue here that this is because current methods initiate the novelty detection process too early. When few sentences have been seen, it is unlikely that the user is negatively affected by redundancy. Therefore, re-ordering the first sentences may be harmful in terms of performance. We propose here a query-dependent method based on cluster analysis to determine where we must start filtering redundancy.

Improving Alternative Text Clustering Quality in the Avoiding Bias Task with Spectral and Flat Partition Algorithms

M. Eduardo Ares, Javier Parapar, Álvaro Barreiro.
Conference Proceedings of the 21st International Conference on Database and Expert Systems Applications DEXA'10, Bilbao, Spain, August 30 - September 3 2010, Lecture Notes in Computer Science, vol. 6262, Part II, pp. 407-421, 2010 | ISBN: 978-3-642-15250-4

Abstract

The problems of finding alternative clusterings and avoiding bias have gained popularity over the last years. In this paper we put the focus on the quality of these alternative clusterings, proposing two approaches based on the use of negative constraints in conjunction with spectral clustering techniques. The first approach tries to introduce these constraints into the core of the constrained normalised cut clustering, while the second one combines spectral clustering and soft constrained k-means. The experiments performed on textual collections showed that the first method does not yield good results, whereas the second one attains large increases in the quality of the clustering results while keeping low similarity with the avoided grouping.

Blog Posts and Comments Extraction and Impact on Retrieval Effectiveness

Javier Parapar, Jorge López-Castro, Álvaro Barreiro.
Conference Proceedings of the 1st Spanish Conference on Information Retrieval, CERI'10, Madrid, Spain, June 15-17, pp. 5-16, 2010 | ISBN: 978-84-693-2200-0

Abstract

This paper focuses on the extraction of certain parts of a blog, the post and the comments, presenting a technique based on the blog structure and the attributes of its elements, exploiting similarities and conventions among different blog providers and Content Management Systems (CMS). The impact of the extraction process on retrieval tasks is also explored. Separate evaluation is performed for both goals: extraction is evaluated through human inspection of the results of the extraction technique over a sample of blogs, while retrieval performance is automatically evaluated through standard TREC methodology and the resources provided by the Blog Track. The results show important and significant improvements over a baseline which does not incorporate the extraction approach.

An Automatic Linking Service of Document Images Reducing the Effects of OCR Errors with Latent Semantics

Renato de Freitas Bulcão-Neto, José Antonio Camacho-Guerrero, Álvaro Barreiro, Javier Parapar, Alessandra Alaniz Macedo
Conference Proceedings of the 25th ACM Symposium on Applied Computing, ACM SAC 2010, pp. 13-17, Switzerland, March 22-26, 2010 | ISBN: 978-1-60558-638-0

Abstract

Robust Information Retrieval (IR) systems are in demand due to the widespread and multipurpose use of document images, and the high number of document image repositories available nowadays. This paper presents a novel approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). The LinkDI service extracts and indexes the content of document images, obtains its latent semantics, and defines relationships among images as hyperlinks. LinkDI was evaluated on document image repositories, comparing the quality of the relationships created among textual documents with the quality of those created among their respective document images. Results show the feasibility of LinkDI even when relating highly degraded OCR output.
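
A small sketch of the general LSI-linking technique: project noisy OCR text into a low-dimensional latent space and link documents whose latent vectors are close, so that shared latent topics weigh more than individual recognition errors. The dimensionality and threshold below are placeholders, not the paper's settings.

    from sklearn.decomposition import TruncatedSVD
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def lsi_links(ocr_texts, dims=100, threshold=0.3):
        # dims must be smaller than the vocabulary size.
        tfidf = TfidfVectorizer().fit_transform(ocr_texts)
        latent = TruncatedSVD(n_components=dims).fit_transform(tfidf)
        sims = cosine_similarity(latent)
        return [(i, j) for i in range(len(ocr_texts))
                for j in range(i + 1, len(ocr_texts))
                if sims[i, j] >= threshold]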

Avoiding Bias in Text Clustering Using Constrained K-means and May-Not-Links

M. Eduardo Ares, Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 2nd International Conference on the Theory of Information Retrieval, ICTIR 2009, Cambridge, UK, September 10-12, 2009, Lecture Notes in Computer Science vol. 5766, pp. 322-329, 2009 | ISBN: 978-3-642-04416-8

Abstract

In this paper we present a new clustering algorithm which extends the traditional batch k-means, enabling the introduction of domain knowledge in the form of Must, Cannot, May and May-Not rules between the data points. In addition, we have applied the presented method to the task of avoiding bias in clustering. Evaluation carried out on standard collections showed considerable improvements in effectiveness over previous constrained and non-constrained algorithms for the given task.
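
As an illustration of how such rules alter the assignment step of k-means, the following sketch forbids the clusters of already-assigned Cannot-link partners; May-Not links would add a finite penalty instead of an infinite one. This is a simplified stand-in, not the paper's exact algorithm.

    import numpy as np

    def constrained_assign(points, centroids, cannot_links):
        # cannot_links: dict mapping a point index to the indices it cannot
        # share a cluster with. Points are assigned in order; a Cannot-link
        # partner that is already placed makes its cluster infinitely costly.
        labels = np.full(len(points), -1)
        for i, p in enumerate(points):
            dist = np.linalg.norm(centroids - p, axis=1)
            for j in cannot_links.get(i, []):
                if labels[j] != -1:
                    dist[labels[j]] = np.inf
            labels[i] = int(np.argmin(dist))
        return labels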

Compression-based document length prior for language models

Javier Parapar, David E. Losada, Álvaro Barreiro
Conference Proceedings of the 32nd ACM International Conference on Research and Development in Information Retrieval SIGIR'09, pp. 652-653, Boston, July 19-23, 2009 | ISBN: 978-1-60558-483-6

Abstract

The inclusion of document length factors has been a major topic in the development of retrieval models. We believe that current models can be further improved by more refined estimations of the document's scope. In this poster we present a new document length prior that uses the size of the compressed document. This new prior is introduced in the context of Language Modeling with Dirichlet smoothing. The evaluation performed on several collections shows significant improvements in effectiveness.
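
The core of the idea fits in a few lines; here is a hedged sketch in which the prior is the log of the zlib-compressed size, added to the Dirichlet-smoothed query likelihood as a log-prior term (the exact estimator used in the poster may differ).

    import math
    import zlib

    def compressed_length_prior(text):
        # Compressed size as a proxy for the document's "scope": highly
        # redundant documents compress well and thus receive a smaller prior.
        return math.log(len(zlib.compress(text.encode("utf-8"))))

    # score(q, d) = log p(q | d) under Dirichlet smoothing
    #               + compressed_length_prior(d)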

Evaluation of text clustering algorithms with n-gram-based document fingerprints

Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 31st European Conference on Information Retrieval Research ECIR 2009, Toulouse, France, April 2009, Lecture Notes in Computer Science vol. 5478, pp. 645-653, 2009 | ISBN: 978-3-642-00957-0

Abstract

This paper presents a new approach designed to reduce the computational load of existing clustering algorithms by trimming down the size of the documents using fingerprinting methods. A thorough evaluation was performed over three different collections, considering four different metrics. The presented approach to document clustering achieved good effectiveness values with considerable savings in memory space and computation time.

Revisiting n-gram based models for retrieval in degraded large collections

Javier Parapar, Ana Freire, Álvaro Barreiro
Conference Proceedings of the 31st European Conference on Information Retrieval Research ECIR 2009, Toulouse, France, April 2009, Lecture Notes in Computer Science vol. 5478, pp. 680-684, 2009 | ISBN: 978-3-642-00957-0

Abstract

Traditional retrieval models based on term matching are not effective on collections of degraded documents (the output of OCR or ASR systems, for instance). This paper presents an n-gram based distributed model for retrieval on large collections of degraded text. Evaluation was carried out with both the TREC Confusion Track and Legal Track collections, showing that the presented approach outperforms, in terms of effectiveness, the classical term-centred approach and most of the participant systems in the TREC Confusion Track.

Winnowing-based text clustering

Javier Parapar, Álvaro Barreiro
Conference Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM 2008, pp. 1353-1354, Napa Valley, California, October 2008 | ISBN: 978-1-59593-991-3

Abstract

We present an approach to document clustering based on winnowing fingerprints that achieves good effectiveness values with considerable savings in memory space and computation time.
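
A compact sketch of the winnowing fingerprinting scheme (Schleimer et al.) that the clustering builds on: hash every character k-gram and keep the minimum hash of each sliding window, yielding a small document signature. The values of k and w are placeholders, not the paper's settings.

    def winnow(text, k=5, w=4):
        # Hash all character k-grams, then select the minimum hash within each
        # window of w consecutive k-grams; the selected hashes form the
        # document's fingerprint, usable as a reduced representation.
        kgram_hashes = [hash(text[i:i + k]) for i in range(len(text) - k + 1)]
        fingerprint = set()
        for i in range(len(kgram_hashes) - w + 1):
            fingerprint.add(min(kgram_hashes[i:i + w]))
        return fingerprint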

Segmentation of legislative documents using a domain-specific lexicon

Ismael Hasan, Javier Parapar, Roi Blanco
Conference DEXA WS 2008, IEEE Press Proceedings, pp. 665-669, Torino, Italy, September 2008 | ISBN: 978-0-7695-3299-8

Abstract

The amount of legal information is continuously growing. New legislative documents appear every day on the Web. Legal documents are produced on a daily basis in briefing format, containing changes to the current legislation, notifications, decisions, resolutions, etc. The scope of these documents includes countries, states, provinces and even city councils. This legal information is produced in a semi-structured format and distributed daily on official websites; however, the huge amount of published information makes it difficult for a user to find a specific issue, with lawyers, who need to access these sources regularly, probably being the most representative example. This motivates the need for legislative information search engines. Standard general web search engines return full documents to the user (typically web pages), within hundreds of pages. As users expect only the relevant part of the document, techniques that recognise and extract these relevant bits of documents are needed to offer quick and effective results. In this paper we present a method to perform segmentation based on domain-specific lexicon information. Our method was tested with a manually tagged dataset coming from different sources of Spanish legislative documents. Results show that this technique is suitable for the task, achieving values of 97.85% recall and 95.99% precision.

The IRLab at the University of A Coruña

Javier Parapar, Álvaro Barreiro
Dissemination Notes BCS-IRSG Informer Vol 25, pp. 5-7 | ISSN: 0950-4974

Abstract

The Information Retrieval Lab is affiliated with the Department of Computer Science of the University of A Coruña (code G000494 in the University catalogue). The group has been researching basic issues in Information Retrieval for more than ten years.

An Effective and Efficient Web News Extraction Technique for an Operational NewsIR System

Javier Parapar, Álvaro Barreiro
Conference 12th Conference of the Spanish Association for Artificial Intelligence, CAEPIA - TTIA 2007, Salamanca, Spain, 12-16 November 2007, Proceedings Vol II, pp. 319-32 | ISBN: 978-84-611-8848-2

Abstract

Web information extraction, and in particular web news extraction, is an open research problem and a key component of NewsIR systems. Current techniques fall short in the quality of their results, their high computational cost, or their need for human intervention, all of them critical issues in a real system. We present an automated approach to news recognition and extraction based on a set of heuristics about article structure, which is currently applied in an operational system. We also built a dataset to evaluate web news extraction methods. Our results on this collection of international news, composed of 4869 web pages from 15 different on-line sources, achieved 97% precision and 94% recall for the news recognition and extraction task.

Writing Science, Compiling Science: The Coruña Corpus of English Scientific Writing

Isabel Moskowich-Spiegel, Javier Parapar
Conference Proceedings from the 31st AEDEAN Conference, XXXI Congreso Internacional de la Asociación Española De Estudios Anglo-Norteamericanos, A Coruña, Spain, 14-17 November 2007, pp. 531-545 | ISBN: 978-84-9749-278-2

Abstract

The Coruña Corpus: A Collection of Samples for the Historical Study of English Scientific Writing is a project on which the Muste Group has been working since 2003 in the University of A Coruña (Spain). It has been designed as a tool for the study of language change in English scientific writing in general as well as within the different scientific disciplines. Its purpose is to facilitate investigation at all linguistic levels, though, in principle, phonology is not included among our intended research topics.

Generating News Summaries at Indexing Time

Javier Parapar
Symposium BCS IRSG Symposium: Future Directions in Information Access, FDIA 2007, Glasgow, Scotland, 28-29 August 2007 | ISSN: 1477-9358

Abstract

This poster presents an efficiency-oriented approach to the task of summary generation for operational news retrieval systems, where summaries are appreciated by the users. This work shows that, for this task, relevant-sentence extraction techniques are suitable because of the compactness of the generated summaries and the low computational costs involved. To minimise the cost of summary construction at retrieval time, we propose an efficient storage of the summaries as sentence offsets inside the documents. At indexing time the user query is not available to guide the selection of the relevant sentences, so the article's title is chosen to generate a title-biased summary, since titles are high-quality descriptions of the news. The sentence offsets are included in the direct file so that the summaries can be reconstructed from this information at query-processing time. This strategy yields a very large improvement in retrieval time with a very small increase in index size, compared with query-biased summaries generated at retrieval time. As future work, we will approach the evaluation of summary quality based on the DUC measurements and the improvement of the relevance scoring formulas.

The Coruña Corpus Tool

Javier Parapar, Isabel Moskowich-Spiegel
Conference Presented at the XXIII Congreso de la Sociedad Española de Procesamiento del Lenguaje Natural, SEPLN 2007, Sevilla, Spain, 10-12 September 2007. Published in Revista del Procesamiento de Lenguaje Natural Vol 39, pp. 289-290 | ISSN: 1135-5948

Abstract

The Coruña Corpus of scientific writing will be used for the diachronic study of scientific discourse at most linguistic levels, thereby contributing to the study of the historical development of English. The Coruña Corpus Tool is an information retrieval system that allows the extraction of knowledge from the corpus.

NowOnWeb: A NewsIR System

Javier Parapar, Álvaro Barreiro
Conference Presented at the XXIII Congreso de la Sociedad Española de Procesamiento del Lenguaje Natural, SEPLN 2007, Sevilla, Spain, 10-12 September 2007. Published in Revista del Procesamiento de Lenguaje Natural Vol 39, pp. 287-288 | ISSN: 1135-5948

Abstract

Nowadays there are thousands of news sites available on-line, and traditional methods of accessing this huge news repository are overwhelmed. In this paper we present NowOnWeb, a news retrieval system that crawls articles from internet publishers and provides news searching and browsing.

Now On Web: News Search and Summarization

Javier Parapar, José M. Casanova, Álvaro Barreiro
Conference Proceedings of EUROCAST 2007, Las Palmas de Gran Canaria, Spain, February 12-16, 2007, Lecture Notes in Computer Science LNCS Vol.4739 pp. 225-232 | ISBN: 978-3-540-75866-2

Abstract

Agile access to the huge amount of information published by the thousands of news sites available on-line calls for the application of Information Retrieval techniques to this problem. The aim of this paper is to present NowOnWeb, a news retrieval system that obtains articles from different on-line sources, providing news searching and browsing. The main problems solved during the development of NowOnWeb were article recognition and extraction, redundancy detection and text summarization. For these problems we provide effective solutions that, put together, give rise to a system that satisfies, in a reasonable way, the daily information needs of the user.

Current Teaching

  • Present 2016

    Software Development Tools (614G01054) (Herramientas de Desarrollo)

    Mandatory course for Software Engineering specialization (4th year) on the B.Sc. Eng. in Computer Science (OBL. EI 4º 2C SE).

  • Present 2021

    Information Retrieval (614G02027) (Recuperación de Información)

Mandatory course (3rd year) on the B.Sc. in Data Science and Engineering (OBL. CED 3º 2C).

  • Present 2022

    Recommender Systems (614G02044) (Sistemas de Recomendación)

Elective course (4th year) on the B.Sc. in Data Science and Engineering (OPT. CED 4º 2C).

  • Present 2013

    Degree Projects in CS Eng (614G01227,614G01106) (Trabajos Fin de Grado)

    B.Sc. Eng. in Computer Science Degree Projects in the Software Engineering and Computer Science specializations (4th year) (OBL. CS)(OBL. SE).

  • Present 2012

    Information Retrieval and Semantic Web (614502010) (Recuperación de Información y Web Semántica)

    Mandatory course on the M.Sc. Eng. in Computer Science (OBL. MsC EI 1C).

Past Teaching

  • 2024 2017

Biomedical Knowledge Management (614522022) (Gestión del Conocimiento Biomédico)

Elective course on the M.Sc. in Biomedical Informatics.

  • 2023 2012

    Information Systems Control (614G01044) (Calidad en Sistemas de Información)

    Mandatory course for Information Systems specialization (3rd year) on the B.Sc. Eng. in Computer Science, elective course on the Information Technologies specialization (4th year) (OBL. EI 3º 2C IS/ OPT. EI 4º 2C IT).

  • 2015 2015

    Software Development Methodologies (614G01051) (Metodologías de Desarrollo)

    Mandatory course for Software Engineering specialization (4th year) on the B.Sc. Eng. in Computer Science, elective course on the Information Systems specialization (4th year) (OBL. EI 4º 1C SE/ OPT. EI 4º 1C IS).

  • 2015 2013

    Software Development Tools (614G01054) (Herramientas de Desarrollo)

    Mandatory course for Software Engineering specialization (4th year) on the B.Sc. Eng. in Computer Science (OBL. EI 4º 2C SE).

  • 2013 2012

    Degree Projects in B.Sc. Eng. (old plans) (Proyecto Fin de Carrera)

    B.Sc. Eng. in Computer Science Degree Projects (old plans) in the Software Engineering and Information Technologies specializations (3rd year).

  • 2012 2011

    Information Technology Audit (614111607) (Auditoría Informática)

Elective course on the M.Sc. Eng. and B.Sc. Eng. in Computer Science (old plans, being phased out).

  • 2012 2011

    Programming II (614G01006) (Programación II)

    Mandatory course (2nd year) on the B.Sc. Eng. in Computer Science.

  • 2011 2010

    Information Systems Design (614111403) (Diseño de Sistemas de Información)

Mandatory course (4th year) on the M.Sc. Eng. + B.Sc. Eng. in Computer Science (old plans, being phased out).

  • 2011 2009

    Artificial Intelligence (614211654,614311654) (Inteligencia Artificial)

Elective course on the B.Sc. Eng. in Computer Science (old plans, being phased out).

  • 2011 2009

    Cognitive Science (614211609,614311609) (Ciencia Cognitiva)

Elective course on the B.Sc. Eng. in Computer Science (old plans, being phased out).

  • 2009 2008

    Programming Technology (614211203) (Tecnología de la Programación)

Mandatory course on the B.Sc. Eng. in Computer Science (old plans, being phased out).

Other courses

  • May 2010

Persistence and Handling of XML Documents in Java (Persistencia y manejo de documentos XML en Java)

    Aula de Formación Informática

  • 2009 2007

The Linux Operating System: Basic Concepts (El sistema operativo Linux. Conceptos Básicos)

    Aula de Formación Informática

  • March 2007

The GNU/Linux OS and OpenOffice 2.0 (El S.O. GNU/Linux. OpenOffice 2.0)

    Consejo Social UDC

  • Nov. 2007

The GNU/Linux OS and OpenOffice 2.0 (El S.O. GNU/Linux. OpenOffice 2.0)

    Confederación de Empresarios de Ferrol

At My Lab

You can find me at my lab, located in the Computer Science Faculty on the Elviña Campus:
  • Javier Parapar
  • Facultad de Informática, Campus de Elviña s/n
  • 15071, A Coruña, Spain

At My Office

You can also find me at my office: S4.2 at the Facultad de Informática, Campus de Elviña.