Co-Chairs: Deborah Hughes Hallett (Adjunct Professor of Public Policy, Harvard Kennedy School; Professor of Mathematics at the University of Arizona) and María C. Latorre Muñoz (Vice Dean for Research, Postgraduate Studies and International Affairs of the Faculty of Statistical Studies, Universidad Complutense de Madrid; RCC Research Fellow).
The amount of information in the world is growing exponentially, and the trend shows no sign of slowing. These data come from the most varied fields: social networks, commercial transactions, observable Internet activity, patient information contained in clinical databases, climatological observations, research on the cosmos from satellites and observatories, and so on. Data are the basic raw material both in the fields that generate them and in others. For example, information from social networks and Internet activity is useful for criminological research, in the fight against terrorism, and in building customer profiles for private companies. It is therefore necessary to search for and collect data, select them, process them, visualize them, interpret them, turn them into useful information and knowledge, and use them in decision-making or for policy advice.
The technologies needed to carry out these tasks are traditionally known as Data Analysis, which has evolved to deal with Big Data. The term refers to the enormous volume of information, and also implies that the information is constantly changing, is organized in different formats or is even unorganized, and must be collected and processed at high speed. Although data analysis is not new, the amount, quality, format and sources of information have changed so much that conventional data collection and processing techniques no longer work in this context. Developing new techniques, methodologies and tools is currently a major research and societal challenge. Doing so requires solid knowledge of statistics, computing, analysis and mathematics, giving rise to a new profile of researcher and professional called “data scientist”. Data science also requires knowledge of the working domains (economics, finance, marketing, sociology, medicine, meteorology, etc.) and organizational skills in information and communication, in order to disseminate what has been found in several research fields.
The study group on “Data Science” is broadly interdisciplinary. Its common denominator is research on one or more of the phases involved in the efficient treatment of information, understood in a broad sense (texts, images, spatio-temporal data, etc.). Our main lines of research are:
- Data analysis techniques and applications in the fields of economics, business, sociology, health and environmental sciences. Process optimization.
- Data modelling, processing and visualization: social networks, text, images, spatial, temporal and space-time data.
- Simulation and modelling of systems: applications to economics, biology, epidemiology and sociology.
- Information processing in a competitive environment.
- Data Mining & Business Intelligence techniques.
- Computer science and artificial intelligence. Programming and data management languages. Computational treatment of large volumes of data (Big Data). Mathematical methods and formal methods applied to information processing.
Daniel Gómez González (Director of the PhD Program in Data Science and of the Research Group in “Data Science and Soft Computing for Social Analytics and Decision Aid”, School of Statistical Studies, Universidad Complutense de Madrid)
Lorenzo Escot Mangas (Director of the research group “Data Analysis for Social and Gender Studies and Equality Policies” of the School of Statistical Studies, Universidad Complutense de Madrid)
Julio Lumbreras Martín (RCC UPM Representative, Visiting Scholar at the Harvard Mossavar-Rahmani Center for Business and Government at Harvard's Kennedy School of Government).