Large language models – artificial intelligence such as ChatGPT, Bard and others – have great potential in the world of international development. But, at least for now, they also have serious limitations.
It seems that everyone is talking about ChatGPT and its artificial intelligence “colleagues” and how they will change the way we work, including those of us who work in development. Some people say these tools will help us solve big global challenges, such as achieving inclusive and sustainable development. Others are concerned that they could have negative impacts that are still undetermined.
To help us find the answer, we decided to ask ChatGPT directly.
We started with the simple question: How can development projects be improved?
ChatGPT replied that improving development projects involves a complex set of actions and strategies that may vary depending on the specific project and its goals.
This is obvious. It sounds like the start of an uninteresting speech, but it continued:
ChatGPT continued to explain that development practitioners should ensure that project objectives are clear, achievable, and aligned with the needs of a specific target population. Having well-defined goals helps to measure the impact of the project and improves accountability.
That is not earth-shattering news, but it is a bit better. It added that development practitioners should involve the communities that benefit from or are affected by the project in its planning, implementation, and evaluation stages. This may increase local ownership and sustainability of the project.
That is true and not unhelpful. To be sure, it is a bit abstract, and not exactly revelatory. But it is far from useless.
ChatGPT is a “large language model” – artificial intelligence designed to perform language tasks – that is trained on massive amounts of data. Google Translate also relies on large language models, as do Siri and Alexa. GPT, short for Generative Pre-Trained Transformer, is a particular kind of large language model: something akin to a machine brain that learns through pattern recognition, forming ‘neural nets’ that have billions of “neurons”, or language patterns.
The text data on which it is trained comes from multiple sources, including websites, articles, and books, thus enabling it to predict the next word in a sequence based on probability. What makes ChatGPT “generative” is its ability to analyze or summarize material from its massive amounts of data to create new content in conversational language, and to continue a conversation with an actual human being.
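Next-word prediction can be illustrated with a toy bigram model – vastly simpler than the transformer behind ChatGPT, and using an invented miniature corpus – to show what “predicting the next word based on probability” means in practice:

```python
from collections import Counter, defaultdict

# An invented toy corpus; a real model is trained on billions of words.
corpus = ("development projects need clear goals . "
          "development projects need community ownership . "
          "development practitioners need clear goals .").split()

# Count which word follows which (a bigram model, not a transformer).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the probability of each candidate next word, given the previous one."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# In this tiny corpus, "need" is followed by "clear" twice and "community" once,
# so the model predicts "clear" with probability 2/3.
print(predict_next("need"))
```

A real large language model does the same kind of probability estimation, but conditions on long stretches of preceding text rather than a single word, and learns the probabilities with a neural network rather than a lookup table.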
The capacities of ChatGPT are changing and improving at an extraordinary rate. Six months from now, things might look radically different from how they look today. Our question is this: For development purposes, are we at the dawn of a new era?
Artificial intelligence will not replace current development practitioners, but they will be replaced one day by new staff who know how to work with AI.
It should be noted that, at least at present, large language models have serious limitations. For example, ChatGPT makes serious mistakes. It inaccurately claims that people coauthored academic papers, and it refers to books that do not exist. There is also a risk that ChatGPT might replicate biases in the data on which it is trained. When we first asked it to name twenty famous economists, it named only men. True, most famous economists are men, but there are famous female economists. The second time we asked it that question, it included MIT development economist and Nobel laureate Esther Duflo.
To some questions, it offers misleading or inaccurate answers. We have asked it some technical questions, to which its answers are sometimes primitive and sometimes erroneous. For policymaking, the best conclusion is straightforward: use large language models with considerable caution and apply critical thinking.
ChatGPT is “noisy” in the technical sense that it does not offer the same answer to the same question. We asked it precisely the same question on multiple occasions, and we never received the same answer. It is a “non-deterministic” algorithm, which means that its answers will vary. It will also give different answers depending on how questions are asked. If you add, for example, “give me an expert view”, it provides more detailed responses.
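This non-determinism comes from sampling: rather than always emitting the single most probable next word, the model draws a word in proportion to its probability, usually adjusted by a “temperature” setting. A minimal sketch of temperature sampling, with invented words and scores:

```python
import math
import random

# Hypothetical next-word scores (logits) from a model; illustrative only.
logits = {"clear": 2.0, "measurable": 1.5, "ambitious": 0.5}

def sample_next_word(logits, temperature=1.0):
    """Draw a word with probability proportional to exp(logit / temperature).

    At high temperature the distribution flattens, so repeated calls with
    identical input can return different words - the source of "noise".
    At very low temperature the top-scoring word wins almost every time.
    """
    weights = {w: math.exp(score / temperature) for w, score in logits.items()}
    r = random.random() * sum(weights.values())
    for word, weight in weights.items():
        r -= weight
        if r <= 0:
            return word
    return word  # guard against floating-point rounding

# The same "prompt" can yield different continuations across calls:
print({sample_next_word(logits) for _ in range(20)})
```

Production systems use far more elaborate decoding, but the principle is the same, which is why asking ChatGPT the identical question twice rarely produces identical text.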
We can hope that its responses to development-related questions are accurate, and that if they are not, they will become accurate over time; the mixed current record of ChatGPT suggests good reason for caution on this count, at least in the short term.
There is a great deal of enthusiasm, even excitement, about the potential of generative AI, and understandably so. It is possible to obtain answers to challenging questions in a matter of seconds. For those interested in public health and safety, those answers can be valuable and impressively specific.
This could affect how quickly low-income countries obtain and process information, provided they get access to the technology and relevant information in local languages is used to train the models.
To achieve these benefits, public sector agencies should focus on six things:
- Build foundations: Invest in data cleaning, quality control of information and knowledge and accuracy checks.
- Enhance understanding: Build the capacity of public sector staff in data and information management, critical thinking, and bias awareness in large language models. The technology is only as good as the data and information it is trained on, and the trainers need to be aware of biases.
- Develop public infrastructure: Build public digital infrastructure that enables staff and citizens to have access to ChatGPT and the like as a public good. This also requires data, information and knowledge management governance of systems that feed into LLMs like ChatGPT.
- Have a voice: Stand up in regional and global fora for equitable access to ChatGPT and similar platforms. This includes also standing up for language translations and access in local language.
- Plan for security: Ensure data security and privacy legal frameworks are developed so that generative AI solutions don’t create cybersecurity risks.
- Don’t trust but test, and keep humans in the loop: Start small and test applications to help staff learn how to work with large language models. ChatGPT is only as useful as staff’s understanding of how to use it. Artificial intelligence will not replace current development practitioners, but they will be replaced one day by new staff who know how to work with AI.
ChatGPT does not yet provide real guidance for development, and for development projects in particular its responses tend to be too abstract or platitudinous to be helpful. Still, these are early days, and we are looking forward to more conversations with ChatGPT.