Mentor: Dr. Eric Raimy
In 2004, Aaron Iba and Andrew Nevins of MIT presented a selectionist learner for reduplication (repetition of part or all of a word to modify its meaning) intended for natural language. Words were abstracted as phonemes, individual units of sound, connected by links, paths between pairs of phonemes. The preexisting links present in the unreduplicated form of the word were distinguished from the links added by reduplication. The learner read pairs of unreduplicated and reduplicated forms, tried every possible link, and produced the result of each. It then evaluated the validity of each result and discarded all impossibilities, hence "selectionist."
When only one link is considered in going from the unreduplicated form (the root) to the reduplicated form, this is a relatively quick process. A search for a four-letter root and an eight-letter reduplicated form considers 136 possible links and finds 9 of them possible. But when two or three links are considered, the number of computations goes up exponentially, turning a process of a couple of seconds into one that takes hours. When two links are considered for the same pair, the program has to search 18,632 possibilities, of which 27 are possible. As Galan Pickard explains more thoroughly in a squib on this reduplicator, this method of search, also called the 'British Museum' method, becomes extremely expensive in time and is very inefficient. Exhaustive search of this kind is notoriously among the slowest of all search strategies.
This method is also very different from the process speakers are currently thought to use to recognize and produce reduplicated forms. I think a much more natural representation of the process is repetition recognition: first the reduplicated section is identified, and then the possible underlying structures are considered. This approach would keep most of the links that lead to invalid output from ever being considered and would reduce the computing time significantly.
Also, this approach is algorithmic, producing results based on information, whereas the existing approach simply tries everything. An approach that looks for repetition would not only reduce computation time but could also shed more light on how this process is carried out in the human brain. Should these changes improve the learner, in computing time or accuracy, it would lend support to arguments that humans reduplicate by searching for patterns and repetition.
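The repetition-recognition idea described above can be illustrated with a minimal sketch. It is not the Iba and Nevins learner, nor a finished design for its replacement. It assumes that each phoneme is written as a single character, that the reduplicated form differs from the root by exactly one copied span, and that the copy sits immediately next to the original. The function name find_repetitions and the example forms are hypothetical, chosen only for illustration.

```python
def find_repetitions(root, redup):
    """Return (start, end) spans of root whose adjacent copy yields redup.

    Assumption: exactly one contiguous span is copied, so the length
    difference between the two forms is the length of the copied span.
    """
    d = len(redup) - len(root)  # length of the repeated material
    spans = []
    if d <= 0:
        return spans  # nothing was added, so there is no repetition to find
    # Slide a window over redup looking for a span followed by its own copy.
    for p in range(len(redup) - 2 * d + 1):
        first = redup[p:p + d]
        second = redup[p + d:p + 2 * d]
        # Keep the span only if deleting one copy gives back the root.
        if first == second and redup[:p + d] + redup[p + 2 * d:] == root:
            spans.append((p, p + d))
    return spans


# Total reduplication of a four-letter root into an eight-letter form.
print(find_repetitions("abcd", "abcdabcd"))
# Partial reduplication of an illustrative five-letter root.
print(find_repetitions("takki", "taktakki"))
```

For a root of four letters and a reduplicated form of eight, this sketch inspects a single window position instead of enumerating every possible link, because the length difference already fixes the size of the repeated section. When more than one span is consistent with the pair (for example root "baa" and form "baaa"), every candidate is returned, which corresponds to the step of considering the possible underlying structures.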