Monday, January 6, 2020

Understanding?

It took me a few days, but I recently finished the Quanta Magazine article, Machines Beat Humans on a Reading Test. But Do They Understand? There's something oddly satisfying about reading a lengthy article over the course of several days. I come to the reading the way I do with a book, shaped by the day's events and my mood and appetite. Congratulations to John Pavlus for what, as far as I can tell as a serious novice, is excellent reporting. I appreciate how he connected with so many different sources and asked questions of the technology and the promised progress.

There's a whole bunch of detail in this article about AI and neural networks and technical pieces that I won't pretend to understand. But there are some fun things to think about. Below are my favorite quotes, each followed by what my mundane brain made of it.

It might be the case that understanding language in a linear, sequential fashion is suboptimal.

From Jakob Uszkoreit, an engineer at Google Brain. How absolutely wonderful is that statement? If we want to train computers to learn language, or equip them to learn it, why do they have to do it in the typical linear, left-to-right sequence that humans use? Computer "brains" can chew through text far faster than our own. If they piece words and sentences together from both directions, wouldn't they learn faster and have a higher probability of being correct?
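To make that idea concrete for myself, here's a tiny sketch, entirely my own addition and not from the article, assuming the Hugging Face transformers library and the bert-base-uncased model are available. BERT guesses a hidden word by reading the words on both sides of the blank, which is exactly the nonsequential reading Uszkoreit is describing.

    # Toy demo of bidirectional context: clues to the hidden word sit on
    # BOTH sides of [MASK], so a strictly left-to-right reader would miss half of them.
    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-uncased")

    sentence = "She deposited the check at the [MASK] before it closed for the day."
    for guess in fill(sentence):
        # Each guess is a dict holding the predicted word and the model's confidence.
        print(guess["token_str"], round(guess["score"], 3))

The clue on the right ("closed for the day") matters as much as the one on the left ("deposited the check"), which is the whole point of letting the model read in both directions at once.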

It's a bit counterintuitive, but it is rooted in results from linguistics, which has for a long time looked at treelike models of language. 

Again from Jakob Uszkoreit. In this case he's referring to how the nonsequential processing of sentences produced treelike structures. The neural network made connections between words throughout the text, much like a tree, or a sentence diagrammed by an elementary student. It allows for "associations between words that might be far away from each other in complex sentences." So looking at text from both directions allowed the computer to build treelike structures of the sentences in order to learn the syntax of the language and increase its capabilities in natural language processing (NLP). Is this not why we ask students to diagram sentences? It's based on the postulation that natural language has a stable, highly structured shape, and that to engage in it you must understand how each word works and connects within a sentence. If we agree on the starting point and the end goal, sentence diagramming is worthwhile. It's teaching computers how to learn our own language at a faster rate than we can learn it. Why can't it work for us?
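And if you want to see those treelike structures without any neural network at all, an off-the-shelf dependency parser will draw one for you. Here's another small sketch, again my own addition rather than anything from the article, assuming spaCy and its small English model (en_core_web_sm) are installed. It prints which word each word hangs from, even when the two are far apart in the sentence.

    # Print the dependency tree: each word points at the word it attaches to,
    # the same kind of long-distance association the article describes.
    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("The book that my teacher recommended last spring finally arrived.")

    for token in doc:
        # token.head is the word this token hangs from in the tree
        print(f"{token.text:12} --{token.dep_:>10}--> {token.head.text}")

Here "book" should attach to "arrived" even though seven words separate them, which is roughly what a diagrammed sentence shows an elementary student.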

It seems like we have a model that has really learned something substantial about language, but it's definitely not understanding English in a comprehensive and robust way.

It's okay, everyone. You can catch your breath. (That quote is from Sam Bowman, a computational linguist at New York University.)

According to Yejin Choi, a computer scientist at the University of Washington and the Allen Institute, one way to encourage progress toward robust understanding is to focus not just on building a better BERT, but also on designing better benchmarks and training data that lower the possibility of Clever Hans-style cheating.

This is sort of a confusing statement, but it lets me go off about education again. The people trying to enhance artificial intelligence, the ones trying to propel the capabilities of machine learning, understand that formative assessment and responsive instruction are the methods for improving learning. Building a better brain won't stop cheating. Building a better student won't fix education. Building a better BERT won't prove much until the benchmarks and training data are of similarly high quality.

Bowman points out that it's hard to know how we would ever be fully convinced that a neural network achieves anything like real understanding. Standardized tests, after all, are supposed to reveal something intrinsic and generalizable about the test-taker's knowledge. But as anyone who has taken an SAT prep course knows, tests can be gamed. 

This is where the author, John Pavlus, really won me over. He moved through a bit of history and some serious explication of NLP and the tasks used to assess AI capabilities, and eventually he came around to the quality of the assessments, the bias in them, and a computer's ability to learn how to do well on a specific test or task. And he pushes the issue by questioning the validity of the assessment results. We're making newer and harder tests. Are the computers getting better at taking those tests at faster rates than humans? Do the results really tell us anything? This is my question every time we promote the ACT or some other standardized test. We subject every student to a standardized test to measure their learning, to judge the quality of instruction, and to score a school and a district. But what are the results really telling us?

We're definitely in an era where the goal is to keep coming up with harder problems that represent language understanding, and keep figuring out how to solve those problems.

Again from Sam Bowman. Why can't this be our drive as humans? Why can't we live in a time, every time, including the present, when we keep coming up with harder problems that represent improvements in our quality of life, and keep figuring out how to solve them? One of those problems might be enhancing AI's capabilities in language acquisition and understanding. Another might be delivering health care in remote areas. Why can't we measure our success on those kinds of problems, instead of on some standardized questions in a timed assessment?

Read the article. Even if it takes you multiple days, you'll have new thoughts and ask yourself some interesting questions. And the writing will be much better than this.
