The world is moving towards a data economy in which data is becoming a major contributor to the creation of economic value. Owners of data have the potential to make large profits, gain influence in society and contribute to a good life in a digital economy.
The step into the data economy requires new Artificial Intelligence (AI) technologies. Machine learning (ML) offers new tools to process data, and these tools are necessary to turn data into something of tangible value for its owners.
The aim of this project is to develop and use machine learning to improve how social scientists approach classical as well as new questions in economics that require large data sets.
The first part of the project will provide new substantive answers to classic economic questions, since machine learning techniques make it possible to overcome the shortcomings of previous approaches. Applications include wage inequality across employers, industries, countries and gender, as well as the (mis)allocation of talent in the economy and how policy can shape this allocation.
The second part of the project will address new questions that have arisen with the digital transformation of the world. AI methods will allow companies to gather data to answer important business questions and boost their profits. Which consumers have similar preferences? Which goods do they buy? Where do they buy and at which price? Firms could use these data to gain market power and raise their profits. Does this harm society? If so, an appropriate regulatory response would be required. ML could (and will) also be used to improve tax audits, infer the value of a house, decide whether to grant credit and assess the associated default probability. This project will develop and apply the required economics and ML tools to offer solutions to these future societal challenges.
The first contribution of the research I propose is a significant theoretical advance which establishes that available matched employer-employee data can reveal the latent characteristics of individual workers and firms without invoking standard but strong assumptions on human behavior. The assumptions can be relaxed since the project uses artificial intelligence methods to analyse the data.
The second contribution of the proposed research is to develop the machine learning algorithms for the proposed identification strategy. The objective is to develop computational tools that enable researchers to estimate latent worker and firm productivities with no more complexity than is involved in estimating wage regressions with worker and firm fixed effects. The first step of the proposed computational algorithm builds on algorithms currently used by companies such as Airbnb, Netflix and Microsoft. Here Google's 2013 breakthrough in language processing seems most promising. The heart of the algorithm will be a novel hierarchical clustering strategy that is transparent and driven by the local properties of the network connecting workers and firms implied by economic theory.
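For orientation, the benchmark named above, a wage regression with worker and firm fixed effects, can be sketched in a few lines. The sketch below is an illustration only and is not the proposed clustering algorithm: it assumes an additive model in which the log wage equals a worker effect plus a firm effect plus noise, and it fits the effects by alternating averages over workers and firms. The function name `fit_two_way_fixed_effects` and the toy data are hypothetical.

```python
from collections import defaultdict


def fit_two_way_fixed_effects(observations, n_iter=500, tol=1e-10):
    """Least-squares fit of log_wage = worker_effect + firm_effect + noise.

    observations: iterable of (worker_id, firm_id, log_wage) tuples.
    Returns two dicts: worker effects and firm effects.
    """
    by_worker = defaultdict(list)
    by_firm = defaultdict(list)
    for worker, firm, log_wage in observations:
        by_worker[worker].append((firm, log_wage))
        by_firm[firm].append((worker, log_wage))

    worker_fe = defaultdict(float)  # all effects start at zero
    firm_fe = defaultdict(float)

    for _ in range(n_iter):
        max_change = 0.0
        # Update each worker effect given the current firm effects.
        for worker, rows in by_worker.items():
            new = sum(y - firm_fe[f] for f, y in rows) / len(rows)
            max_change = max(max_change, abs(new - worker_fe[worker]))
            worker_fe[worker] = new
        # Update each firm effect given the current worker effects.
        for firm, rows in by_firm.items():
            new = sum(y - worker_fe[w] for w, y in rows) / len(rows)
            max_change = max(max_change, abs(new - firm_fe[firm]))
            firm_fe[firm] = new
        if max_change < tol:
            break

    return dict(worker_fe), dict(firm_fe)


# Hypothetical toy data: workers "a" and "b" move between firms "X" and "Y".
toy = [
    ("a", "X", 1.0), ("a", "Y", 1.5),
    ("b", "X", 2.0), ("b", "Y", 2.5),
    ("c", "Y", 2.0),
]
worker_effects, firm_effects = fit_two_way_fixed_effects(toy)
firm_premium = firm_effects["Y"] - firm_effects["X"]
```

In this additive benchmark the levels of the effects are pinned down only up to a common constant within a connected set of workers and firms, so only differences such as `firm_premium` are meaningful, and they rely on workers who move between firms.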
The third contribution of the proposed research will be to apply the proposed methods to register-based matched employer-employee data from Denmark/Norway to obtain substantive answers to a number of important empirical questions, including employer-size wage differences, inter-industry wage differences, misallocation and aggregate income, mismatch over the business cycle, international trade and unemployment insurance.
Finally, the computational tools will help to understand firms' smart pricing strategies and to develop appropriate regulatory responses to the challenges of the 21st-century digital economy.