Wednesday, December 9, 2009

Sentence Parsing

Parse diagrams are a pedagogical tool used in linguistics to analyze sentences and their structure. The underlying concepts apply to Arabic and make for highly instructive exercises that solidify a student’s understanding of the concepts covered in the core of the grammar. Students can start using this tool at any stage, applying it to everyday sentences in order to analyze their structure; it is not just for those who have completed the grammar core. Not only does this method of analysis solidify grammatical concepts, but it also presents them from interesting angles that give insight into how the language works.


This parsing strategy is one among many pedagogical strategies. The task in this particular method is to start with a sentence and group words that belong together based on what the grammar dictates. Once the task is done, the result is a set of groups and subgroups that make up the logical structure of the sentence and its compound parts.

For example, consider this sentence.

تهدّمتْ القنطرةُ العتيقةُ
The ancient archway became dilapidated

We start with the above, and end up with what is below.



The task of grouping is applied to the entire sentence. Each group is then isolated, and we recursively apply the parsing method to each group. Notice that this results in a hierarchy. We can then take that hierarchy and represent it as a two-dimensional tree if we so choose. Operations on these trees lead to a deep understanding of how the language works. It is slightly more complex than that, but this is essentially what is known as X-Bar Theory. This tutorial does not discuss X-Bar Theory.
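As a rough illustration of such a hierarchy, the groups for the example sentence could be stored as a nested structure and printed as an indented tree. This is only a minimal sketch: the `Node` class, the label names, and the analysis of the subject as a noun followed by its adjective are assumptions made for illustration, not the notation of the original diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One group in the parse: a label, its text, and any subgroups."""
    label: str
    text: str
    children: list["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> list[str]:
    """Return one indented line per group, parents before children."""
    lines = ["  " * depth + f"{node.label}: {node.text}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Assumed analysis: a verbal sentence whose subject is a noun plus adjective.
sentence = Node("Verbal sentence", "تهدّمتْ القنطرةُ العتيقةُ", [
    Node("Verb", "تهدّمتْ"),
    Node("Subject", "القنطرةُ العتيقةُ", [
        Node("Described noun", "القنطرةُ"),
        Node("Adjective", "العتيقةُ"),
    ]),
])

print("\n".join(render(sentence)))
```

Each level of indentation corresponds to one round of grouping, so the depth of the tree equals the number of times the method had to be applied.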


The primary goal of parsing a sentence using this strategy is to understand its meaning and sieve out any ambiguities. Once a person has analyzed a sentence using this method, it is nearly impossible to misunderstand the meaning, provided the parsing was done correctly.


The parsing is traditionally done by starting at the most abstract layer of the sentence. We identify the major portions of the sentence and group them. We then visit each group from right to left and recursively apply the same procedure until the most granular layer has been parsed. The most granular layer is the one in which each word is its own group. This process can be summarized as follows.

1. If the portion currently under analysis is a single word, stop.
2. Otherwise, count the number of words in the portion.
3. Identify the type of phrase/sentence and, consequently, how many major portions it can have.
4. Group the words into these major portions.
5. For each portion from right to left, repeat from step 1.
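The steps above can be sketched as a short recursive function. This is a sketch under stated assumptions, not a working Arabic parser: the grammatical decision in steps 2 to 4 is delegated to a hypothetical `group` callback, and `toy_grouper` is a stand-in that merely splits off the first word. Words are stored in reading order, so iterating the list from its start corresponds to moving from right to left on the page.

```python
from typing import Callable

# A portion is a list of words in reading order (right to left on the page).
Portion = list[str]
# Hypothetical callback: given a multi-word portion, return its major portions.
Grouper = Callable[[Portion], list[Portion]]


def parse(portion: Portion, group: Grouper):
    """Recursively group a portion until every group is a single word."""
    if not portion:
        raise ValueError("cannot parse an empty portion")
    if len(portion) == 1:            # step 1: a single word ends the recursion
        return portion[0]
    portions = group(portion)        # steps 2-4: the grammar decides the groups
    if len(portions) < 2:
        raise ValueError("a multi-word portion must split into several groups")
    # step 5: visit each group in reading order and repeat the procedure
    return [parse(p, group) for p in portions]


def toy_grouper(portion: Portion) -> list[Portion]:
    """Stand-in for real grammatical analysis: split off the first word."""
    return [portion[:1], portion[1:]]


print(parse(["تهدّمتْ", "القنطرةُ", "العتيقةُ"], toy_grouper))
```

The nested lists returned by `parse` mirror the groups and subgroups of the diagram; a real grouper would need to recognize the sentence or phrase type before splitting.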

Let’s take another example.

جرتْ الخيلُ فقالتْ          حَبَطِقْطِقْ حبطقطق
The horse galloped and made the noise
Habatiqtiq, Habatiqtiq

The above is actually a conjunction of two sentences where the conjoining particle is the ف. So we start by grouping the above into three groups – the first sentence, the conjunction, and the second sentence.

Clause 1: جرتْ الخيلُ
Conjunction: فـ
Clause 2: قالتْ حَبَطِقْطِقْ حبطقطق

Now that that’s done, we find that the first group is not a single word. Therefore, we need to recursively apply the procedure to it. It is made up of a verb and the subject of that verb.

Clause 1: جرتْ (verb), الخيلُ (subject)
Conjunction: فـ
Clause 2: قالتْ حَبَطِقْطِقْ حبطقطق

If the subject were compound, we would have analyzed it further. Since it is not, we can safely leave Clause 1. The conjunction is, of course, a single word, so we leave it as well. Clause 2, on the other hand, is made up of the verb “to say” followed by the quoted speech.

حَبَطِقْطِقْ حبطقطق

[Parse diagram labels: ناسخ الإبتداء (sentential abrogator); Multiple Allowed]



A nominal sentence must have a topic and at least one comment. In this minimal situation, the topic will be grouped as one group and the comment will be grouped as another, resulting in only two groups. If there are multiple comments, each will be treated as its own group. Consider the example below.

[Parse diagram labels: Comment 1; Comment 2]

In rare situations, there is a separating pronoun between the topic and comment of the sentence. When this happens, there is typically only one comment (except in very rare occurrences). And so the sentence results in three sections – the topic, the separating pronoun, and the comment. An example follows.


In addition, nominal sentences can be abrogated by sentential abrogators. These abrogators may be either verbs or particles, but regardless of which kind they are, the sentence is still parsed as a nominal sentence. The difference is that there is one extra component, namely the abrogator, and the major parts are given different names. Consider the examples below.

[Parse diagram labels: Subject of كانت (hidden); Predicate of كانت]

[Parse diagram labels: Subject of ان; Predicate of ان]
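As a hedged sketch, the grouping rules for nominal sentences described above (a topic, one or more comments, an optional separating pronoun, and an optional abrogator) might be encoded as follows. The function name `nominal_groups`, the label strings, and the placeholder inputs are illustrative assumptions rather than part of the original diagrams.

```python
def nominal_groups(topic, comments, abrogator=None, separating_pronoun=None):
    """Return (label, text) pairs for the major portions of a nominal sentence."""
    if not comments:
        raise ValueError("a nominal sentence needs at least one comment")
    groups = []
    if abrogator is not None:
        # With an abrogator, the major parts are renamed after it.
        groups.append(("Abrogator", abrogator))
        topic_label = f"Subject of {abrogator}"
        comment_label = f"Predicate of {abrogator}"
    else:
        topic_label, comment_label = "Topic", "Comment"
    groups.append((topic_label, topic))
    if separating_pronoun is not None:
        groups.append(("Separating pronoun", separating_pronoun))
    for i, comment in enumerate(comments, start=1):
        # Number the comments only when there is more than one.
        label = comment_label if len(comments) == 1 else f"{comment_label} {i}"
        groups.append((label, comment))
    return groups


# Placeholder strings stand in for actual Arabic words and phrases.
print(nominal_groups("<topic>", ["<comment 1>", "<comment 2>"]))
print(nominal_groups("(hidden)", ["<predicate>"], abrogator="كانت"))
```

Each returned group would then be parsed further with the recursive procedure if it contains more than one word.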

Verbal Sentences

[Parse diagram labels: [نائب] فاعل / [ergative] subject; مفعول بهم / direct objects; prepositional links; Multiple Allowed (three times)]


A verbal sentence must also have at least two components – the verb itself and the subject. If the verb is passive, then it must have an ergative subject in place of the subject. Consider the short example below.

[Parse diagram labels: Subject (hidden)]

Optionally, a verb may have up to three direct objects depending on its level of transitivity. In the example below, the verb is passive, which means that what would have been the first direct object has taken the place of the subject. And since the verb is transitive to two objects, the second direct object is still visible.

[Parse diagram labels: Ergative Subject (hidden); 2nd Direct Object]

In addition to potentially three direct objects, a verb may also have adverbs, each of which will form its own group in parsing the sentence. There are four types of adverbs in the language and a verb may have more than one of each (although this is rare). In the example below, the verb has a direct object as well as one of each adverb type.






And finally, a verb may have any number of adverbial and prepositional phrases linking to it. This will be discussed later in detail.
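The components of a verbal sentence listed above can be summarized in a small data structure. This is a minimal sketch under assumptions: the class name `VerbalSentence`, its field names, and the label strings are hypothetical, and the four adverb types are not distinguished.

```python
from dataclasses import dataclass, field


@dataclass
class VerbalSentence:
    verb: str
    subject: str = "(hidden)"          # a hidden subject is still listed
    passive: bool = False
    direct_objects: list[str] = field(default_factory=list)
    adverbs: list[str] = field(default_factory=list)
    prepositional_links: list[str] = field(default_factory=list)

    def groups(self) -> list[tuple[str, str]]:
        """Return (label, text) pairs for the major portions."""
        # In the passive, the first object has become the ergative subject,
        # so at most two objects remain and they are numbered from two.
        limit, first = (2, 2) if self.passive else (3, 1)
        if len(self.direct_objects) > limit:
            raise ValueError("too many direct objects for this verb")
        subject_label = "Ergative subject" if self.passive else "Subject"
        result = [("Verb", self.verb), (subject_label, self.subject)]
        for i, obj in enumerate(self.direct_objects, start=first):
            result.append((f"Direct object {i}", obj))
        result += [("Adverb", adverb) for adverb in self.adverbs]
        result += [("Prepositional link", p) for p in self.prepositional_links]
        return result


# Placeholder strings stand in for actual Arabic words.
example = VerbalSentence("<passive verb>", passive=True, direct_objects=["<object>"])
print(example.groups())
```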

The Difference Between Hidden and Omitted

Words in Arabic sentences may be hidden as well as omitted. These two concepts seem identical prima facie, but they are very different. Hidden implies that the word is fully present in the sentence but simply not visible in script or pronunciation. As a result, it must absolutely be mentioned when parsing the sentence. An example of this is the subject of a verb; when a verb’s subject is not mentioned as an explicit noun or pronoun following the verb, the verb is assumed to carry the subject within itself.

Omitted, on the other hand, implies that the word is no longer part of the sentence. For example, the verb “to hit” may or may not take a direct object – we can say “he hit” or “he hit his brother”. In the former case, the object of “hit” has been omitted. One of the rhetorical purposes of omitting the object in this case is to keep the verb general. If we say that the direct object is hidden as opposed to omitted, that implies that the object is still present and the rhetorical benefit has not been achieved.

Multilevel Parsing

Thus far we have focused largely on grouping the major portions of nominal and verbal sentences, and even conjoined sentences. Now, if a major portion is a single word, then our task is complete. If it is an entire sentence, we simply apply recursion. The hard part is when it is a phrase. There is a plethora of phrase types in Arabic, each with its own major portions and terminology. Furthermore, their major portions are themselves highly likely to be compound. An example of an embedded phrase and sentence follows.

ولا يلتام ما جرح اللسانُ
But that which the tongue injures does not mend

The above sentence is verbal; its major portions are the verb and the subject. The subject happens to be a phrase with an embedded sentence, which is also verbal and likewise consists of a verb and its subject. Below is the parse diagram for the sentence, ignoring minor details.


Verb: لا يلتام
Subject: ما (Relative Pronoun), جرح اللسانُ (Relative Clause)

Only the predicate of a nominal sentence can be an entire sentence on its own. Other major portions of a sentence must be either single words or phrases. Most phrases, however, can contain embedded sentences, as seen in the example above. Part of mastering sentence parsing is developing a keen understanding of these different phrases and how they work. For this purpose, one needs to learn Arabic through regular classes taught by esteemed professionals.

Prepositional Phrases

One of the most notorious elements in a sentence, as far as parsing is concerned, is the prepositional phrase. It is often quite difficult to determine how it should be grouped, and a mistake here can result in vastly divergent meanings. It is the sign of a strong grammarian to group prepositional phrases seamlessly when parsing.

Consider this sentence, and try to determine its translation before moving forward.

غضِبتُ راغباً فيه عنه

This is a perfectly valid and harmonious sentence, but its translation is not so clear. Logically speaking, there are four options with respect to the two prepositional phrases. They can both be connected to the verb غضبت, they can both be connected to the participle راغبا, or each can be connected to a different one of the two (in either arrangement). In fact, it is also possible that they are connected to hidden words.

Based on common sense and what we know of Arabic lexicology and grammar, only two cases are likely here; either فيه is grouped with غضبت and عنه with راغب, or vice versa. In the former case, the translation of the sentence would be “I became angry regarding it, inclining away from it.” And in the latter case, the meaning afforded would be “I became angry [distancing myself] from him, while inclining towards it.”
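The four logical options mentioned above can be enumerated mechanically, since each of the two phrases independently attaches to one of two words. The snippet below is only an illustrative sketch; the variable names are assumptions, and the possibility of hidden words is left out.

```python
from itertools import product

hosts = ["غضبت", "راغبا"]       # the verb and the participle
phrases = ["فيه", "عنه"]        # the two prepositional phrases

# Each phrase independently picks a host: 2 x 2 = 4 assignments.
options = [dict(zip(phrases, choice))
           for choice in product(hosts, repeat=len(phrases))]

for option in options:
    print(", ".join(f"{phrase} -> {host}" for phrase, host in option.items()))

# The two likely readings are those in which the phrases attach to
# different hosts.
likely = [o for o in options if len(set(o.values())) == 2]
```

Filtering for assignments with two distinct hosts leaves exactly the two readings translated above.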

One can clearly see how easily the meaning can be distorted. But once we have determined which word a prepositional phrase is associated with and grouped with, there remains the question of how exactly to do the grouping.

1. If the word to which the prepositional phrase is trying to link accepts such links, we can group them together immediately. Types of words that accept these links are:
   a. verbs
   b. gerunds
   c. active, passive, hyperbolic, and resembling participles (which are derived nouns)
   d. occasionally superlatives as well (which are also derived nouns)
2. If the word does not fall into one of the mentioned categories, then
   a. if the word precedes the prepositional phrase, the phrase will link to a hidden word which it is able to link to, and that hidden word will then become an adjective
   b. if the word follows the prepositional phrase, the phrase will link to a hidden word which it is able to link to, and that hidden word will then become حال (circumstantial adverb)
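The decision rule above can be written as a small function. This is a minimal sketch under assumptions: the `WordType` enumeration and the `attach` function are hypothetical names, classifying a word is assumed to have been done already, and superlatives are treated as always accepting links even though the text says they do so only occasionally.

```python
from enum import Enum, auto


class WordType(Enum):
    VERB = auto()
    GERUND = auto()
    PARTICIPLE = auto()      # active, passive, hyperbolic, or resembling
    SUPERLATIVE = auto()     # simplification: accepts links only occasionally
    OTHER = auto()


# Word types that accept prepositional links directly (rule 1).
ACCEPTS_LINKS = {
    WordType.VERB,
    WordType.GERUND,
    WordType.PARTICIPLE,
    WordType.SUPERLATIVE,
}


def attach(word_type: WordType, word_precedes_phrase: bool) -> str:
    """Describe how a prepositional phrase is grouped with a given word."""
    if word_type in ACCEPTS_LINKS:
        return "link directly to the word"
    if word_precedes_phrase:      # rule 2a
        return "link to a hidden word that becomes an adjective"
    # rule 2b
    return "link to a hidden word that becomes a حال (circumstantial adverb)"


print(attach(WordType.VERB, True))
print(attach(WordType.OTHER, True))
print(attach(WordType.OTHER, False))
```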

