The Effectiveness of Input on Voiced and Unvoiced Consonant Sounds in a Native Arabic Speaking Context: An Action Research Project

The paper describes the process and outcomes of an action research project with the aim of determining whether focusing classroom input on voiced and unvoiced consonant sounds has a positive effect on their production. Statistics were derived from English-speaking respondents listening to native Arabic speaking participants from an experimental group, who had received input on the difference between these sounds, and practiced their production, as well as to speakers from a control group who had received neither input nor practice. The rates of intelligibility were compared, with the conclusion being that the provision of limited input on this pronunciation issue does not, generally speaking, result in the ability to produce the sounds with greater clarity.

contextualization was more beneficial regarding the former, due to time pressure, while others did not implement the strategy because their lexical knowledge was limited. Furthermore, it is stated that the mental effort required is also an issue.
Transferring phonological features from the mother tongue can cause issues with a listener's comprehension in an international context, which can be exacerbated if bottom-up approaches are used for speech processing. An example would be analyzing individual sounds to understand a turn, as opposed to implementing a top-down approach, which could involve utilizing background knowledge to make sense of a speaker and develop expectations about what would be heard.
To maximize mutual intelligibility, Jenkins (2000) advocates mastering the Lingua Franca Core, the phonological features which cause intelligibility problems for an audience from differing language backgrounds. This includes all the consonant sounds, with the exception of /θ/ and /ð/, which she recommends replacing with /f/ and /v/, and involves distinguishing between unvoiced and voiced sounds. However, native Arabic speakers have issues with such minimal pairs and struggle to differentiate between the two options in each, as their language does not contain the voiced sound (Alfehaid, 2015). Moreover, Hago and Khan (2015) comment on voicing being an issue in general, while Alfehaid (2015) is of the opinion that, despite knowing the aforementioned minimal pairs are different, his compatriots still find production to be an issue due to their absence from the Arabic inventory. To compound matters, Smith (2007) states that /p/ and /b/ are also allophonic, while, unsurprisingly, /g/ and /k/ tend to be confused as well.
As mutual intelligibility is becoming increasingly pertinent due to the quantity of non-native speaker to non-native speaker interaction (Kirkpatrick, 2010), "native speaker models have limited relevance" (Pickering, 2006, p. 219) and alternatives to a single native speaker model are required. Consequently, Jenkins (2000) is of the opinion that this needs to be taken into consideration, as different norms apply when non-native speakers interact. Her thesis is based on research conducted on intelligibility errors among non-native speakers and includes the avoidance of deviant core sound production. One such feature is differentiating between the voiced and unvoiced consonant sounds, which, apart from the aforementioned /θ/ and /ð/, is regarded as being integral in producing intelligible language and is claimed to be eminently teachable.

Background
Pronunciation instruction has been overlooked in language teaching to the extent that it has been called language input's "orphan" (Gilbert, 2010, p. 1). This prolonged neglect has also meant that it has been said to be suffering from the "Cinderella Syndrome - kept behind doors and out of sight" (Celce-Murcia et al., 1996, p. 323). It may not be a regular classroom activity (Macdonald, 2002) if teachers feel inadequately prepared to teach it (Ur, 1996), as numerous teachers have stated a lack of knowledge on the issue (Moedjito, 2009), with many, including experienced teachers, not having undertaken phonetic training (Dauer, 2005). Furthermore, it may not be appropriately emphasized in curricula, with suitable materials being unavailable (Fraser, 2000; Macdonald, 2002) ... have taught the Headway series of pronunciation books, such as Bowler and Parminter (2002), page by page, a procedure which failed to address learner needs as the majority of the material covered was not an issue for the learners.
This neglect is in spite of a growing focus on communication in EFL, meaning the production of comprehensible and intelligible speech (Gilakjani, 2012). In this, listeners comprehend a turn and the speaker is aware of how difficult it is for their output to be understood, as commented on by Derwing and Munro (2005, p. 385), who state that such a focus is reasonable, and attainable, unlike attempting to perfect native-like pronunciation.
Though there is no single agreed upon pronunciation (Dauer, 2005), Saudi Arabian English is in English's outer circle. This is in contrast to the inner circle, which is used for the norms of the language (Kachru, 2005), despite the fact that English belongs to the outer circle as much as anyone else, due to the sheer volume of non-native speakers. Their number has been estimated to be as high as 500 million in the more than 50 countries where English is used institutionally, with a further 500-1,000 million people in the expanding or extending circle. In contrast, there are a "mere" 320 to 380 million speakers in the inner circle (Crystal, 2003).
According to Jenkins (2000), inaccurate pronunciation has been claimed to be the most common reason for the loss of oral comprehension, with segmentals featuring prominently, including the inability to differentiate between voiced and unvoiced minimal pairs. She further states that such a distinction in oral production is integral in the maintenance of intelligibility in an international context, with the failure to produce the relevant sounds being a potential cause of unsuccessful communication. The issue is further exacerbated if the aforementioned bottom-up processing is implemented, as non-native speakers have less recourse to contextual or syntactical information, and are heavily reliant on word level interpretations (Field, 2004).
Furthermore, Derwing and Munro (1997) are of the belief that grammatical and prosodic proficiency are as prominent in comprehension as pronunciation, with other factors needing to be taken into consideration, including external noise (Rogers, Dalby, & Nishi, 2004), and lexical variations (Seidlhofer, 2003).
Consequently, the focus of the research was to determine if the provision of input on voiced and unvoiced consonants had a positive effect on the ability to comprehend non-native speakers.

Method
The participants in the research were native Arabic speaking pre-sessional Saudi Arabian male nationals at Prince Sultan University, Riyadh. As the participants' speaking and listening skills teacher, I provided input, for the purpose of the research, over the duration of the 16-week semester on pronunciation features which prove difficult for Arabic speakers, such as discerning and producing voiced and unvoiced minimal pairs. This skill is of major concern to many learners, as, in Alfehaid's (2015) list of issues, 80% of speakers mentioned it. This is because, even though Modern Standard Arabic has 32 consonant phonemes (Smith, 2007), making it a consonant-heavy language, some English consonants are absent from its inventory, such as /v/, leading to the problematic nature of Arabic English (Watson, 2002). Smith (2007) also states that as Arabic does not exhibit /v/, there is a tendency to overuse its unvoiced equivalent /f/, despite the latter also being absent from the Arabic inventory.
Even the consonants which seem similar, such as /t/ or /k/, are different in manner and place of articulation (Al-Solami, 2013), as the English /t/ is alveolar and aspirated in the word initial position followed by a vowel, whereas in the same position, the Arabic equivalent is dental and non-aspirated (Tushyeh, 1996). Moreover, /d/ is always unreleased and voiceless in the word-final position in Arabic, meaning a word such as "played", tends to be pronounced as "plate" (Bauman-Waengler, 2009).
In an attempt to combat these issues, the International Phonetic Alphabet (IPA) was introduced through various websites (Appendix A), and classroom materials produced (Appendices B and C) for the promotion of learner autonomy. Activities were then undertaken which specifically focused on the contrast between voiced and unvoiced minimal pairs. These comprised a kinesthetic activity which had students placing their arms in the air if a word had a voiced consonant sound, or folding them if a word contained an unvoiced sound. These words were grouped, with the students told to discriminate between, for example, /f/ and /v/ in word pairs such as fail and veil. On top of this, Pronunciation Journey (Hancock, 2000), 50:50 (Appendix D), The Telephone Number Game (Appendix E) and Bingo (Appendix F) were conducted.
Students were randomly selected for both the experimental and control groups from those who had volunteered to participate. After the former group had received input and been provided with the opportunity to practice, a number from each group were recorded in order to negate the idiosyncrasies an individual may have shown. Their oral production was played to non-Arabic speakers, only some of whom had previously been exposed to this variety of English.
The participants were recorded pronouncing the voiced and unvoiced minimal pairs. This was undertaken at the sentence level, using Appendix G, to prevent contextualization, which helps a listener distinguish output (Tennant, n.d.), though this claim has been disputed even with regard to speakers at the First Certificate in English (FCE) level (Jenkins, 2002). Listeners then provided feedback on what they had heard.

Results
The research results, as set out in Table 1, reveal that despite the input, the students from the experimental group did not outperform those from the control group, with the statistics being practically the same for both groups.

Table 1. Control Group's incorrect answers = 288 (out of 994*) = 28.9%

* There were 30 fewer responses for the Control Group, as a participant conducted the first task incorrectly, stating both options, as opposed to one. Five listeners responded to this output before it became apparent that this was the case, resulting in the discontinuation of the analysis of the turn.
As well as calculating the total number of correct and incorrect responses from the native English speakers, the results for each minimal pair were determined, as well as whether or not exposure to Arabic, and/or Arabic English, had played a part in the statistics.
Regarding the minimal pairs, a concept which Cook (2008) states is useful in pronunciation teaching, the data is provided in Table 2 below. However, it should be noted that the pairs were not afforded equal exposure. This was, in part, due to the absence of /dʒ/ and /tʃ/ from the options available in the second task the participants completed, The Telephone Number Activity (Appendix E). This pair was excluded as Hayden (1950) stated that /t/ is the second most common consonant sound, with /s/ fourth, /k/ eighth, /v/ eleventh, and /p/ twelfth, making them more common, while, more recently, Mines, Hanson and Shoup (1978) placed /t/, /s/ and /d/ in the top ten phonemes, which accounted for 47% of the 103,000 examples analysed in natural speech.
Also, although /θ/ and /ð/ have been commented on as being problematic by Pardede (n.d.), and Kharma and Hajjaj (1997), the unvoiced sound of the pair is the third least frequent consonant, with it, as well as its voiced equivalent, tending to be replaced in many dialects, such as Irish English, with /t/ and /d/ respectively (Cruttenden, 2013). Moreover, as previously stated, Jenkins (2000) is of the belief that mastering them is not a requirement for comprehension, and that the time spent on it is redundant.
The individualized nature of the task also played a part. For example, one telephone number required the distinction between /s/ and /z/ to be made on 5 occasions, while there weren't any instances of /k/ and /g/ in this particular turn. Furthermore, even though the students were told to provide a contrived telephone number, all began with 05, thereby increasing the number of /s/ and /b/ sounds, while the dearth of the numbers 8 and 9 limited the distinction between /f/ and /v/ having to be made.
Consequently, as well as the totals, the percentages are provided for perspective.

Though the absence of sounds from the Arabic inventory is commonly referred to as being the cause of the problem, such as with /p/ being pronounced as its voiced equivalent /b/ (Jenkins, 2009), perversely, Ababneh (2018) claims that an awareness of the persistent Arabic problem with this issue leads to overcompensation.
On the plus side, the greatest agreement between the learners in the control group and their audience concerned /t/ and /d/, which are needed to accurately form the past simple and past participle of regular verbs. This is particularly germane if IELTS is going to be the future aim of the learners, given pronunciation's prominence in the speaking exam's grading. However, regarding the provision of input, it could only be claimed that this had had a positive effect on the production of /tʃ/ and /dʒ/, and /k/ and /g/, as these were the only pairs in which the Experimental Group outperformed the Control Group.
The 64 respondents, from both sexes, were all native English speakers with the exception of a single native speaker of each of the following: Albanian, Slovakian and French. Furthermore, 9 native speakers were bilingual, with Hindi being the other language in question for a solitary person, and the remainder having Welsh as their shared mother tongue. They were from the UK, the USA, Australia, the Republic of Ireland, Canada and South Africa, with the majority from the first three.
It is worth noting that the data would have been analyzed according to whether the respondents' mother tongue was English, or not, if a significant number had been non-native speakers, in order to determine if this issue was significant.
With reference to listeners with experience of Arabic, and/or Arabic English, this meant at least a year in countries where it is the native language. The number fulfilling this criterion was 22, with the most experienced having spent 17 years in such countries. Those with no experience totalled 41, and added to this number was a listener with only 2 months of experience.
Among the listeners with experience of Arabic (English), disagreements with the speaker regarding the output totalled 302 of 1024 sounds. Those who had not been exposed to Arabic (English) performed similarly, with the disagreements numbering 288 from the analysis of 994 sounds. The resulting percentages were practically identical, as can be seen in Table 3 below. The conclusion can therefore be drawn, rather surprisingly it could be argued, that more extensive exposure to Arabic and/or Arabic English did not result in better comprehension of learner output, unlike in the results of Bent and Bradlow (2003). In that work, the conclusion was drawn that listeners comprehend a native accent more easily than a non-native accent, though comprehension of the latter does increase after exposure. Similarly, Clarke and Garrett (2004), who presented listeners with sentences spoken in their native accent, as well as a foreign accent, discovered that an initial delay in processing non-native speech rapidly decreases.
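The closeness of the two disagreement rates can be checked with a short calculation. The following is a minimal sketch in Python, assuming only the counts given in the text (302 disagreements from 1024 sounds, and 288 from 994). The two-proportion z-test is an illustrative addition and was not part of the original analysis, and the function names are hypothetical. The test also assumes that each response is independent, which is only an approximation here, as each listener judged many sounds.

```python
import math


def disagreement_rate(disagreements, total):
    """Return the percentage of sounds on which listener and speaker disagreed."""
    return 100.0 * disagreements / total


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions.

    Illustrative only: assumes independent responses, which the
    original study design does not strictly guarantee.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Counts taken from the text above
    first = (302, 1024)
    second = (288, 994)

    print(f"First group rate:  {disagreement_rate(*first):.1f}%")
    print(f"Second group rate: {disagreement_rate(*second):.1f}%")

    z, p = two_proportion_z_test(*first, *second)
    print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

Under these assumptions, the gap between the two rates is well within what sampling variation alone could produce, which is consistent with the percentages being described as practically identical.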

Discussion
The input on pronunciation was undertaken as the provision of its explicit instruction in the classroom has proved to be effective in work conducted by Couper (2006), Saito (2007), Kissling (2013) and Sturm (2013), amongst others, with effective pronunciation rendering a speaker intelligible even if errors are made in other subskills. This is according to Fraser (2000), who goes on to say that, on the other hand, poor pronunciation means a speaker is difficult to comprehend, with Thornbury (2006) stating that misunderstanding commonly ensues.
However, as the input did not enable listeners to distinguish with greater accuracy between the voiced and unvoiced consonant sounds of speakers who had received instruction on this problematic issue, I shall discontinue providing material on this language feature in the Arabian university context. This is despite Rost (2005) stating that such input provides learners with the phonological knowledge required for language learning to take place, as long as the learners are motivated and pay attention to input. Unfortunately, such motivation tends to be absent, as the focus is on the passing of examinations in a pre-sessional subject, English, which is taken due to it being a pre-requisite for those who perform inadequately in the enrolment test, as opposed to being taken through choice.
Having said that, the lack of the input's effect may also have been due to the amount provided, as Ellis and Shintani (2014), and Nation (2007), have stated that learners need access to a sufficient quantity, in order for effective learning to occur.
Unfortunately, it is not a feasible proposition to provide the depth and detail undertaken by Pardede (2018), for example. In this research project, there was input on how to produce sounds, the provision of communication activities, with written versions of oral presentations and strategies for analysis being provided, and modelling and individual correction techniques also covered. Here, the teacher reported the results of the analyses of learner speech samples individually, including the annotation of sounds, and the diagnostic analysis of each turn, in tutorials. These were followed by individualized programs, before learners were recorded to contrast them with native models. Furthermore, Computer-Assisted Language Learning (CALL) was used, to promote autonomy by allowing students to hear their own errors and see their graphic representations. Recorded sound models were also used, as was self-monitoring and self-correction, with the final stage being the reading aloud of teacher feedback.
Explicit instruction on phonological form was provided to help learners notice the difference between their production and that of L1 speakers, as espoused by Derwing and Munro (2005). Also, explicit input was undertaken to correct erroneous pronunciation in the pre-test task, with it being presented in the Actions Implementation Report.
This study was conducted in 3 cycles, each of four stages, which were planning, actions, observation, and reflection. Overall, the action research was conducted in 23 sessions; the pre-test, implementing action, and 3 post-tests, with data collected via the tests, as well as questionnaires.
The pre-test involved transcribing the students' reading phonetically, followed by each student's phonetic transcription being compared to a native speaker's transcription of the same passage. A rating was provided by comparing discrepancies with the native speaker, thus identifying the problematic features, with the post-tests administered at the end of each corresponding cycle in order to assess progress.
In 7 sessions, 35 activities were undertaken, which were deemed to have effectively enhanced pronunciation, and a positive attitude was observed among the participants, due to the opportunity to improve this language skill. This supports Dörnyei's (1998) belief that positive motivation plays an important role in, and accentuates, language learning, while Yousofi and Naderfarjad (2015) showed that motivation correlates significantly with pronunciation.
Also, in the provision of input to intermediate English learners, the students receiving input on phonetic symbols and phonemic transcription performed better than those in the control group in a listening test (Khaghaninejad & Saber, 2015). Consequently, the results reveal that thorough input with detailed and explicit theory, practice and feedback is effective.
If the results had been different, and time was not of the essence, more games could have been integrated as implementing such a resource encourages learners to take an active role in their learning process (Crookall, 1990), and allows unconscious learning, as attention is on the activity (Cross, 2000).
Another resource could have been the use of film. Even though using this in the teaching of pronunciation has been found not to have resulted in major improvement, the learners in question still responded positively to its use, and it was found to have enhanced motivation (Handayani, 2017). Using such authentic materials also shows the relevance of listening skills in life. Thus, the provision of interesting and varied input which has not been artificially designed for teaching is beneficial, by developing self-confidence, as well as contextualizing learning (Vandergrift & Goh, 2012).
Technology could also have been exploited as there are numerous tools available for the teaching of pronunciation, which aid in learners' exposure to a variety of techniques (Moedjito, 2009). However, its use has been accused of having the potential to make the procedure more intimidating (Yoshida, 2018).
To conclude, in the context in which the input was conducted, I shall refrain from providing further pronunciation input due to the inability to cover the issue in a detailed manner, to a receptive audience (Suter, 1976) with a positive orientation to the language being learnt (Moyer, 2007).