The Existing and Potential Uses of Natural Language Processing in Parliaments
Author: Grant Vergottini, CEO at Xcential Legislative Technologies. Written in September 2024
This publication is part of the book “Artificial Intelligence in Legislative Services: Principles for Effective Implementation”.
Introduction
The use of natural language processing in parliaments is often cited as one of the most promising applications of artificial intelligence-related technologies. However, it has already been in use for many years. Legislation lends itself to natural language processing; in fact, it requires it. There are many good reasons for this. Amongst these are:
First, the corpus of legislation is very constrained, well written, and well structured.
Second, the vocabulary of parliamentary procedures is generally quite small, well controlled, and consistently applied.
Third, and perhaps most importantly, the parliamentary “machine” for legislating is well defined and tends to enforce a level of rigour that lends itself to good computability.
At the same time, there are obstacles to the adoption of some advanced technologies that must be addressed.
First, while parliamentary procedures are very well defined, they have been developed over many centuries. Today, many of these procedures are bound in place by legislative traditions ill-suited to modern automation. Changing deeply rooted traditions is always difficult, if not impossible. Instead, modern automation must adapt to the parliamentary procedures that exist, often limiting what can be achieved.
Second, while the language of legislation is well controlled, it has not been static. With time, it has evolved, leaving behind a corpus of documents defined by vocabularies that, while consistent at any point in time, are not consistent across time. Coupled with the difficulty that comes with documents bound as law and thus requiring legislative action to change, the tools must again adapt to the existing corpus rather than expecting the corpus to be revised.
Third, parliaments are generally risk averse. With histories reaching back further than anyone’s memory, there is a concern that the procedures and documents a parliament manages hold too many unknowns to risk drastic changes – so change is always very measured.
Fourth, there is a high bar for accuracy. The language of legislation, as enacted, becomes the law of the land. Correcting errors usually requires further legislative action and is not well tolerated. While natural language processing has made leaps and bounds over the past few years, working with the precise standards of parliamentary procedures is a particularly difficult challenge.
Existing Uses of Natural Language Processing
Natural Language Processing has been used for several decades now to modernise how parliaments work. While some of these tools are somewhat crude when compared to modern technologies, they have had to evolve around the challenges parliaments impose and have developed unique strengths that many of today’s general-purpose technologies do not have. Let us look at some of these existing applications of natural language processing.
Converting Existing Text
Legislative texts can date back centuries. Before any modernisation project can be undertaken, having a strategy to bring these texts into modern times is essential. These documents exist in all sorts of forms, from ancient scrolls of vellum to outdated digital files from obsolete computing systems. Even digital files can pose a challenge given the plethora of file formats, character encodings, and even document models that might have been used.
Many parliaments have multiple legacy formats to contend with, from different eras of the past. When it is desired to bring the entire corpus of documents a parliament has into the modern world, the challenge can be vast.
The first challenge, after identifying the extent of the document set, is to figure out a way to normalise all the historical texts into a common digital medium. This is itself a challenge when the medium has decayed, the language is arcane, and even the writing, often handwriting, is difficult to read. While OCR technologies can sometimes be useful, this stage usually requires extensive manual checking, and establishing the digital representation as authentic is difficult. This process can take years to complete.
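As a minimal sketch of that first pass, assuming the pages have already been scanned to images, OCR can produce a raw digital text for subsequent manual checking. This uses the pytesseract library over Tesseract; the file path is hypothetical.

```python
# A minimal sketch of OCR over a scanned historical page using pytesseract.
# The output is raw text only; as noted above, it still needs extensive
# manual checking before it can be treated as authoritative.
from PIL import Image
import pytesseract

def extract_page_text(path: str) -> str:
    """Run OCR over one scanned page image and return the raw text."""
    return pytesseract.image_to_string(Image.open(path))

# Hypothetical path to a scanned statute page; real archives span many media.
print(extract_page_text("scans/statutes_1898_page_014.png"))
```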
However, extracting the text into a digital representation is only the first step in modernising the data. The next step is where natural language processing comes into play. Tools must be devised to tease out the legal semantics of the words into a format fit for computer processing. Document structures, interdependencies, references, temporal relationships, definitions, terms, and other metadata must all be extracted from the text.
Existing tools can accomplish this sort of data conversion to a very high degree of accuracy, but at a considerable cost. Without well-defined specifications for how old texts were written, yet with a high degree of required accuracy, there is a lot of trial and error followed by extensive verification. Newer natural language processing techniques promise to reduce the cost of data conversion considerably, but only if the accuracy bar can be met.
When it comes to representing the legal semantics extracted from the document, it is necessary to have a document model specially designed for legislation. While there are several proprietary document models in use today, there is now also a standardised information model: the OASIS LegalDocML standard, also known as Akoma Ntoso. It provides the facilities necessary to record the structure, connectivity, and underlying metadata that serve as the basis for an industry of well-vetted (and trusted) AI-based tools to emerge.
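To give a sense of what this structured representation looks like, here is a minimal sketch, in Python with lxml, of reading a simplified Akoma Ntoso fragment. The sample act body is hypothetical; real documents carry far richer metadata (identifiers, lifecycle events, references).

```python
# A minimal sketch of reading an Akoma Ntoso (LegalDocML) fragment with lxml.
from lxml import etree

AKN_NS = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"
NSMAP = {"akn": AKN_NS}

# A hypothetical, highly simplified act body for illustration only.
sample = f"""
<akomaNtoso xmlns="{AKN_NS}">
  <act>
    <body>
      <section eId="sec_1">
        <num>1.</num>
        <heading>Short title</heading>
        <content><p>This Act may be cited as the Example Act.</p></content>
      </section>
    </body>
  </act>
</akomaNtoso>
"""

root = etree.fromstring(sample.encode("utf-8"))
for section in root.findall(".//akn:section", NSMAP):
    num = section.findtext("akn:num", namespaces=NSMAP)
    heading = section.findtext("akn:heading", namespaces=NSMAP)
    print(section.get("eId"), num, heading)  # sec_1 1. Short title
```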
Amendment Impact
Another existing use of natural language processing is in amendment processing. Amendments, whether to statutes, codes, or even to bills, can be very mechanical, proposing changes in very terse and arcane ways. What cannot be readily seen is the impact that these amendments will have.
Amendment impact programs attempt to prospectively execute amendatory language to show the language of a law or a bill as it will look if the amendments are adopted. This can bring a degree of clarity to an amendment proposal that would otherwise be difficult to ascertain.
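As a rough illustration of what such a program does, the following minimal sketch applies a simplified strike-and-insert instruction to a provision to produce the text as it would read if adopted. The Amendment structure and the sec_2 provision are hypothetical; real amendatory language requires far more sophisticated parsing of its structure and targets.

```python
# A minimal sketch of "prospectively executing" a strike-and-insert amendment,
# assuming a simplified instruction format.
from dataclasses import dataclass

@dataclass
class Amendment:
    target: str   # identifier of the provision being amended (hypothetical)
    strike: str   # text to be struck out
    insert: str   # text to be inserted in its place

def apply_amendment(provisions: dict[str, str], amdt: Amendment) -> dict[str, str]:
    """Return a copy of the provisions showing the law as it would read."""
    text = provisions[amdt.target]
    if amdt.strike not in text:
        raise ValueError(f"Struck text not found in {amdt.target}")
    updated = dict(provisions)
    updated[amdt.target] = text.replace(amdt.strike, amdt.insert, 1)
    return updated

law = {"sec_2": "The fee shall be ten dollars per application."}
amdt = Amendment("sec_2", "ten dollars", "fifteen dollars")
print(apply_amendment(law, amdt)["sec_2"])
# The fee shall be fifteen dollars per application.
```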
Standardising Language
While some parliaments have already adopted consistent language for specifying amendments, some jurisdictions wish to adopt this practice to enable future automation capabilities. Natural language processing techniques are employed to analyse the entire historical record of amendments and to identify usage patterns and all the variations. These techniques can then be used to define a complete vocabulary to be used by computer automations when automatically generating amendment language in future systems.
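A minimal sketch of this kind of survey might classify historical amendments against a handful of candidate phrasings and count the variations. The patterns below are illustrative only, not any jurisdiction’s actual controlled vocabulary.

```python
# A minimal sketch of surveying variation in historical amendment phrasing,
# assuming amendments are available as plain strings.
import re
from collections import Counter

PATTERNS = {
    "strike_insert": re.compile(r"\bstrike\b.*\binsert\b", re.I),
    "omit_substitute": re.compile(r"\bomit\b.*\bsubstitute\b", re.I),
    "leave_out": re.compile(r"\bleave out\b", re.I),
}

def classify(amendment_text: str) -> str:
    """Match an amendment against known phrasings; flag anything unfamiliar."""
    for name, pattern in PATTERNS.items():
        if pattern.search(amendment_text):
            return name
    return "unrecognised"

history = [
    "In section 3, strike 'may' and insert 'shall'.",
    "Clause 4, page 2, line 7, leave out 'annually'.",
    "Omit paragraph (b) and substitute the following paragraph.",
]
print(Counter(classify(a) for a in history))
```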
Identifying Conflicts
Natural language processing is also used to identify conflicts that may exist between bills. By prospectively executing amendments and then using language comparison techniques, conflicts become clearer earlier in the legislative cycle, obviating the need for complex and costly conflict resolution measures to be employed once legislation has been adopted.
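At its simplest, conflict detection can begin by checking whether pending bills amend the same provisions, as in the minimal sketch below. The bill names and section identifiers are hypothetical, and a real analysis must also compare the resulting texts rather than just the targets.

```python
# A minimal sketch of flagging potential conflicts between pending bills,
# assuming each bill is reduced to the set of provisions it amends.
def find_conflicts(bills: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Report pairs of bills whose amendments touch the same provisions."""
    conflicts = []
    names = sorted(bills)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            overlap = bills[a] & bills[b]
            if overlap:
                conflicts.append((a, b, overlap))
    return conflicts

pending = {
    "Bill 101": {"sec_2", "sec_7"},
    "Bill 205": {"sec_7", "sec_9"},
}
print(find_conflicts(pending))  # [('Bill 101', 'Bill 205', {'sec_7'})]
```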
Codification
One of the most complex activities a parliament can undertake is codification or re-codification. This is essentially a refactoring of existing law to make it more consistent, coherent, and easier to work with. Codification involves restating existing law in a new arrangement – sometimes after decades of chaos have settled into existing law. Sometimes this involves organising laws, found in yearly statute books, into codes arranged by topic. Other times, it involves reworking the codes to simplify or otherwise restate the law.
The complexity of this process means it can take many decades to complete. For example, the State of California began the process of codification in the late 1920s, and this process continued for the next 30 years. At the US Federal level recodification began in 1926 and continues to this day with no end in sight.
The process of codification is extraordinarily complex, as it must ensure that no substantive changes are made to the law in the course of restating it. The paper trail that must be recorded is immense and must work against a dataset of law that does not stand still while this process is underway.
Natural language processing is increasingly being used to help expedite this difficult task, by improving search, tracking changes automatically, and helping to restructure or restate existing provisions.
Emerging Uses for Natural Language Processing
With the rapid advances in Artificial Intelligence technologies over the past few years, including in the field of natural language processing, several new opportunities to apply these technologies are opening up.
However, caution is required when applying off-the-shelf technologies to the process of creating legislation. While the legislation produced by a parliament provides a vast and well-defined corpus of text for natural language processing, the information model behind the scenes is very complex.
Legislative documents exist in many states of being. Bills begin as drafts and, if enacted, end up as laws. Along the way, they are subject to a great deal of debate, which can result in numerous amendments proposing changes, sometimes in conflict with one another. Understanding the complex lifecycle of a bill and the associated documentation is key to being able to provide effective tools to automate the process.
One of the most important aspects to consider is privacy. The attorneys that work on legislation often work with staffers and politicians in ways that are subject to attorney-client privilege. Whether it is a draft that is being prepared or an amendment to be proposed, there are times when the privacy veil must be protected. Often, the underlying policy document that can tie all the disparate documents that define a piece of legislation in totality is the very document that is subject to attorney-client privilege. When training a natural language processor, the dataset must always be provided in such a way that privacy does not inadvertently get pierced.
Many of today’s laws were enacted long before computing tools were available to automate the process. These laws were written in the paper era and still have many of the limitations that come as a result. Sometimes, information is missing or specified in complex ways that require a human with an extensive understanding of the context to follow. For example, a citation or reference to a provision of a law always has a temporal aspect, even if not explicitly stated.
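One way to handle the temporal aspect is to make it explicit in the data model, as in this minimal sketch. The structure and field names are hypothetical.

```python
# A minimal sketch of making a citation's implicit temporal aspect explicit,
# assuming a simple point-in-time model.
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Citation:
    work: str        # e.g. "Example Act 1998" (hypothetical)
    provision: str   # e.g. "section 12(3)"
    as_at: date      # the version of the law the citation points to

# "section 12(3)" as it stood when the citing document was drafted, which may
# differ from the provision as it reads today after later amendments.
ref = Citation(work="Example Act 1998", provision="section 12(3)",
               as_at=date(2005, 6, 1))
print(ref)
```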
Also, although parliaments are reluctant to admit it, there can be numerous errors that have never been caught and fixed. Sometimes there is ambiguity – such as when numbering gets inadvertently duplicated or reused. Often, manual procedures are in place to work around these complexities. It will fall upon any natural language processing tools to identify these situations and apply the same intricate procedures a human would to work around them.
The vocabulary of legislation is relatively small and quite precise. Unfortunately, this is not always a benefit. While most jurisdictions derive their procedures from a common set of core procedures defined by the early Westminster system, often these procedures have been warped and reshaped over time by individual parliaments to suit their own needs. Sometimes, the result has been to redefine terminology in ways that are both familiar and confusing. Configuring natural language processing technologies to be aware of these subtleties can be quite a challenge.
One of the key considerations for any tool introduced into parliamentary procedures must be the degree to which it improves the functioning of the organisation. Today, laws are crafted by attorneys with special qualifications to make the judgments necessary to enact valid laws. These practitioners of the law have very specialised knowledge of areas of the law. Tools must be provided that improve their effectiveness and efficiency rather than striving to replace them.
The result should be tools that act as a co-pilot rather than an auto-pilot when it comes to drafting legislation. There is a risk here. Much as the calculator can take away an engineer’s intuition for numbers, too much reliance on a drafting co-pilot can lead to complacency in drafting. For an engineer or a software developer, errors that come from too eagerly accepting the computer’s suggestion are often quickly uncovered. For the drafting attorney, where the checks and balances may be more subtle, these sorts of errors might easily go unnoticed.
This leads to the next consideration – algorithmic transparency. For an application to be working alongside a drafting attorney in the role of a co-pilot, full transparency is a must to understand any inherent bias in its algorithms, whether intentional or not.
Policy Making
Much of the hype surrounding the application of Natural Language Processing and other AI-based technologies to legislation tends to be shallow, expressing the task of making law in generalities. This often derives from a superficial understanding of how legislation works. When considering the actual process of legislating, the application of new technologies becomes less clear. That is not to say there is no application for new AI-based natural language processing. Rather, it is an acknowledgement of the complexity of the problem.
Writing legislation is a complex undertaking. It starts with a decision on a policy change and ends with specific changes to the law to implement those policies. In between lies a series of steps known as the legislative process, involving debate and commentary along the way. Let us look at the steps:
The first step is to identify, in plain language, the policy effect that is desired. There may be many stakeholders involved with differing opinions that will shape the overall policy. While this stage usually kicks off the process of creating legislation, it may continue throughout the legislative process up until the final version is put to the final vote. Much of this aspect is subject to privacy rules and cannot be used as source data for training an AI-based system. However, it is quite possible that an AI-based assistant can be used to formulate the policy, suggesting changes based on its understanding of existing law, the current political climate, and other factors available to it.
The next step is to translate the desired policy effect into the mechanical language of a bill. Depending on the legislative drafting practices adopted by a parliament, there might be several steps to take plain language policy effects and turn them into concrete changes in the law. This is where natural language processing could have a significant impact on the process. The process of translating plain language into specific amendments to existing law is particularly complex. However, as the original policy change may remain fluid, the process of tracking the relationship of plain language policy changes to concrete amendments to the law becomes quite complicated. Specialised natural language processing technologies are necessary to maintain these relationships, adapting them as new changes are proposed, or existing ones altered.
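A minimal sketch of the traceability structure involved might link each plain-language policy goal to the amendments that implement it, as below. The identifiers and fields are hypothetical, and a real system must also version both sides as they change over time.

```python
# A minimal sketch of tracking the link between plain-language policy goals
# and the concrete amendments that implement them.
from dataclasses import dataclass, field

@dataclass
class PolicyGoal:
    goal_id: str
    description: str
    amendments: list[str] = field(default_factory=list)  # amendment identifiers

goals = {
    "P1": PolicyGoal("P1", "Raise the application fee to cover costs."),
}
# As drafters translate the policy into mechanical language, each resulting
# amendment is linked back to the goal it implements.
goals["P1"].amendments.append("amdt_12: in sec_2, strike 'ten' insert 'fifteen'")
print(goals["P1"])
```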
The next step opens even more complexity. Once a bill draft is introduced, it becomes subject to the amending cycle, where many amendments, both supportive and not, might be introduced to the language of the bill. These amendments, which will be described in more detail later, need to be traced back through the process to understand not only their impact on the law, but also how they modify the policy changes being proposed.
Augmenting Drafting
Drafting attorneys are a key asset of a drafting office. Their knowledge of the law is invaluable. Many times, the experience they have amassed over the years can help guide them to draft effective legislation. It should be the goal of any AI-based system to augment the skills of drafters, allowing them to be more effective, and to distribute the know-how of the drafting office, providing continuity in drafting capability as drafters retire or move on to other roles.
Specialised tools skilled in understanding the historical context of legislation, both past and present, can provide valuable insights into what may or may not be politically acceptable, what the consequences of proposals might be, and how likely a policy change is to be invalidated or preempted by constitutional or other higher-level laws.
Summarisation
Often, the mechanical language of legislation is accompanied by a summary or digest that explains the policy intent of the legislation. This is usually a sanitised summarisation of the policy documents and discussion that led to the legislation being proposed. The summarisation capabilities of natural language processing can naturally be used to produce this description, with appropriate safeguards to ensure that all privacy considerations are respected.
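As a rough illustration, the sketch below drafts a digest with an off-the-shelf summariser from the Hugging Face transformers library. The bill text is invented and the model choice is arbitrary; any real output would need careful review by drafting staff, with privileged inputs excluded.

```python
# A minimal sketch of drafting a bill digest with an off-the-shelf summariser.
from transformers import pipeline

summariser = pipeline("summarization", model="facebook/bart-large-cnn")

# Invented bill text for illustration only.
bill_text = (
    "Section 1 amends the Example Act to raise the application fee from ten "
    "to fifteen dollars. Section 2 requires the agency to publish the fee "
    "schedule annually and report collections to the parliament."
)
digest = summariser(bill_text, max_length=40, min_length=10, do_sample=False)
print(digest[0]["summary_text"])
```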
Analysing Comments
In some areas of lawmaking, public commentary can play an important role. There are several ways in which natural language processing can help:
If the number of comments received is very large, natural language processing tools can sift through the comments to make sense of them before a human ever looks at them, arranging and grading them for processing, as sketched after this list.
Natural language tools can also be used to sift out comments introduced by troublemakers intending to gum up the process with noise or to lay traps to derail legislation.
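As a minimal sketch of the triage step mentioned above, the example below groups similar submissions using TF-IDF and k-means from scikit-learn. The comments are invented, and the cluster count is illustrative; a real system would need tuning on actual data.

```python
# A minimal sketch of triaging public comments by grouping similar submissions.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

comments = [
    "The fee increase is too high for small businesses.",
    "Small firms cannot absorb this fee increase.",
    "Please require annual publication of the fee schedule.",
    "Publish the schedule every year so the public can see it.",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(comments)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for label, comment in sorted(zip(labels, comments)):
    print(label, comment)
```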
Policy Simulation
There is often discussion about how the law might be simulated. This would help with what-if analysis, allowing the impact of various proposals and amendments to be understood. As a general capability, this remains a far-off dream. However, in specific areas such as tax policy, simulation models already exist. The real question is whether these models can be integrated with natural language processing to allow policy change proposals to be tested against them in near real time by stakeholders lacking the specialised know-how to drive these models.
A related capability would be to apply experience data from similar legislation to predict the outcome of legislation – allowing various scenarios to be tested in the current political climate and improving the likelihood of a positive outcome.
Improving Search
One of the most obvious applications for natural language processing is to improve the search experience. Of course, allowing someone to pose a query in natural language is desirable, but there is so much more that can be offered.
Today, legislative information is distributed across a vast set of documents and databases. While tools are available to show the relationships between all these disparate data sources, they are quite primitive, lacking much of the semantic information that would allow an AI-based system to apply truly intelligent queries.
Data sources that can and should be integrated include statutes, codes, prior drafts (usually with privacy considerations), pending legislation, enacted information, analysis, votes, proposed and adopted amendments, and so on. The connections quickly become very complex.
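A minimal sketch of such a unified search might rank disparate sources against a natural-language query using TF-IDF and cosine similarity from scikit-learn. The documents below are invented, and a production system would use richer semantic models and the metadata links discussed above.

```python
# A minimal sketch of a unified search across disparate legislative sources.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = {
    "statute sec_2": "The application fee shall be fifteen dollars.",
    "bill 101 sec_1": "Raises the application fee and requires reporting.",
    "committee analysis": "Fee revenue has not kept pace with processing costs.",
}

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(documents.values())
query_vec = vectorizer.transform(["what is the application fee"])
scores = cosine_similarity(query_vec, matrix)[0]
for name, score in sorted(zip(documents, scores), key=lambda x: -x[1]):
    print(f"{score:.2f}  {name}")
```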
Inter-Jurisdictional Harmonisation
Building these tools is beyond the expertise and budget of any single jurisdiction. For these tools to become a reality, there is a need for some level of cooperative industry of tool vendors to emerge. This is where having a more robust information model, such as that provided by LegalDocML, can be invaluable – paving the way for tools that integrate information across a parliament and between jurisdictions in truly smart ways.
Building these tools and technologies creates new challenges. Some legal scholars refer to a future they call the “legal singularity” – when all laws have been harmonised across jurisdictions, made complete, and made computer processable – almost as a computer program.
This is a long-term dream that is unlikely to ever be realised. Jurisdictions simply have little motivation to work on such a monumental undertaking given the immense cost and disruption it would cause.
Instead, new tools and technologies must be adapted to the local vocabulary and processes that already exist. This means two levels of training are necessary for a natural language processing system. The first is a meta-level understanding of how a parliament or jurisdiction specifies, creates, and disseminates its legislation. The second is an understanding of the laws that result.
Inter-Jurisdictional Cooperation
There are narrow subjects where some level of cooperation is essential because of the capabilities it would enable. Examples include the enforcement of treaties and other international agreements. Global challenges such as climate change, cryptocurrencies, social media, and financial and other regulations demand increasing cooperation among parliaments. Natural language processing tools offer capabilities that promise to reduce the friction of complying with these agreements.
Conclusion
Natural language processing is already a well-established technology, employed by parliaments today to automate much of their modernisation initiatives and existing procedures. However, with the rapid advancement of Artificial Intelligence technologies in recent years, new opportunities to provide innovative solutions for parliaments are emerging. The challenge will be adapting off-the-shelf technologies to the specialised needs of parliaments, where complex information models are the norm, where privacy considerations are mandatory, and where extreme accuracy is demanded. The goal must be to augment the drafting attorney, making them more effective, reducing bottlenecks, and providing drafting offices with continuity as drafters retire or move on after amassing decades of experience that they would otherwise take with them.