Dynamic assessment on high school biology students’ reasoning skills

Article history Received: 25 October 2020 Revised: 14 April 2021 Accepted: 25 April 2021
Students' reasoning skills may be improved through the kind of feedback constructed in dynamic assessment. The research aims to analyze the effect of dynamic assessment on high school biology students' reasoning skills. The research employs a quasi-experimental method with a pretest-posttest control group design. The participants were students of the grade X Science Program (N=61), who were selected purposively. Data on reasoning skills were collected by a pretest and posttest focused on the concepts of bacteria, with the dynamic assessment method. The data analysis technique employs an independent sample t-test. The research result indicates that dynamic assessment provides a better effect on students' reasoning skills in the bacteria material.


INTRODUCTION
Dynamic assessment (DA) has the potential to train and improve students' capacity for change, including the progress of their reasoning skills. Repeated evaluations in DA provide opportunities to assess learning at the micro level and provide information and advice for learning at the macro level in terms of assessing reasoning skills. DA allows teachers to facilitate learning with feedback to students and to improve learning efficiency, because they can recognize the difficulties experienced by the students and help to solve these problems (Stevenson, Bergwerff, Heiser, & Resing, 2013).
The most dominant approaches used in DA are the "sandwich format" and the "cake format". The sandwich format proceeds through a pretest, a mediation phase, and a posttest. In the cake format, the intervention or assistance is administered within the assessment itself, and learners receive assistance on each question item that they find difficult. The assistance in the cake format is usually scripted, often as a menu of hints, prompts, or clues (Grigorenko, 2009; Grigorenko & Sternberg, 1998). The "cake format" was adopted in this research. In this approach, DA was implemented using prompts, hints, or cues given while students completed the assessment. The researcher mediated the students in identifying and handling errors on each test question (Shabani, 2016).
Research on DA mostly aims to improve cognitive abilities (Kovalčíková, 2015), children's language abilities, and reading skills, and some research has been done in mathematics as well (Stevenson, Heiser, & Resing, 2016; Tzuriel & Caspi, 2017), whereas DA research on biology topics is rare. Prominent works on biology concepts include the web-based dynamic assessment focusing on photosynthesis, carried out in a series of studies by Wang from National Tsing Hua University, Taiwan (Lin & Wang, 2017; Wang, Wu, Yu, & Lin, 2015; Wang, 2011, 2018; Wang, Wang, & Huang, 2008), and work on ecology, also by Taiwanese scholars (Hung, Hwang, Su, & Lin, 2012). Therefore, there is still wide scope to develop DA on other topics in biology.
This research focused on two aspects of reasoning skills: correlational reasoning and combinatorial reasoning. Correlational reasoning is the ability to recognize causal relationships between variables. Meanwhile, combinatorial reasoning is the ability to consider several factors or combinations in order to reach a conclusion (Shofiyah & Wulandari, 2018). The effect of dynamic assessment on reasoning skills has not been studied widely. One study on this topic, by van der Graaf and team, examined the scientific reasoning of kindergarten students.
There are four methods to implement dynamic assessment: clinical interviews, testing the limits, graduated prompting, and pre-test-train-post-test (Kovalčíková, 2015). Clinical interviews, testing the limits, and pre-test-train-post-test are the common approaches used for language and mathematics. Graduated prompting is used by Wang in his web-based dynamic assessment on photosynthesis.
Novitasari, Ramli, & Karyanto (2018) have developed the basis of DA, called the Facts and Proofs Diagnostic Test (FPD) and the Structural Communication Grid Test (SCG), on the concepts of bacteria. The result showed that the FPD and SCG can be used to detect students' conceptual understanding, misconceptions, and argumentation abilities. However, the FPD and the SCG are still categorized as static assessments. We adopted some questions regarding bacteria from these instruments, and some modified from the Khan Academy, to develop a dynamic assessment on bacteria, and further checked the effectiveness of this DA instrument on the reasoning skills of students.

Research Design
The research was quasi-experimental, using a pretest-posttest control group design. The research learning design is illustrated in Table 1. The independent variable was dynamic assessment, whereas the dependent variable was reasoning skills.

Population and Samples
The research was conducted at one of the public senior high schools in Surakarta, in grade X of the Science Program in the academic year 2018/2019, with a population of 220 students. Samples were selected purposively, i.e., from the grade that was learning the topic of bacteria at the time the research was done. Two grade X science classes were randomly selected as the experimental class (N = 30) and the control class (N = 31). All students in both classes agreed to join the research as participants.

Instrument
The research instrument measured students' reasoning skills and covered the concepts learned on the topic of bacteria. The test focused on training correlational and combinatorial reasoning. The instrument consists of 53 multiple-choice items, with a prompt for every item that was given to the experimental class, and can be completed within 45 minutes. Construct and content validity were checked in advance by an expert. An empirical validity test of the items using the Pearson product-moment formula yielded r > r table, with a minimum range of 0.89, indicating that the items were valid. Further, a reliability test generated a value of 0.25 > 0.21, which means that the instrument was reliable (Khasanah, Ramli, & Dwiastuti, 2020).
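The empirical validity check can be illustrated with a minimal sketch, assuming that each item's scores are correlated with students' total scores and compared against a critical r table value. The function names, the toy response matrix, and the threshold below are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def valid_items(responses, r_table):
    """Return indices of items whose item-total correlation exceeds r_table.

    responses: one list per student, holding 0/1 scores for each item.
    r_table: critical value looked up for the sample size (assumed input).
    """
    totals = [sum(student) for student in responses]
    keep = []
    for i in range(len(responses[0])):
        item_scores = [student[i] for student in responses]
        if len(set(item_scores)) == 1:
            continue  # constant item: correlation undefined, treated as not valid
        if pearson_r(item_scores, totals) > r_table:
            keep.append(i)
    return keep


# Hypothetical toy data: 5 students x 3 items (1 = correct, 0 = incorrect).
toy = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
]
print(valid_items(toy, r_table=0.5))
```

In practice the critical value depends on the number of students and the chosen significance level, so it would be looked up rather than hard-coded.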

Procedure
In the first stage, students in both classes worked on the pretest questions, which were delivered through Google Forms. Next, the experimental class students were invited to work on the same problems as in the pretest stage. Students who had answered all the questions could click the view score button to see feedback on whether their answers were correct or incorrect.
If students chose the correct answer, feedback confirmed that the answer was correct. If students made a mistake, feedback was given as graduated prompting. The prompts were questions about bacteria, given once, that directed the students towards the correct answer.
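The feedback rule described here can be sketched as follows. This is a minimal illustration, not the Google Forms implementation used in the study: the `Item` structure, the `feedback` function, and the prompt wording are hypothetical, and only the question topic is borrowed from the bacteria material.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    question: str
    correct: str
    # Prompts ordered from general to more specific (hypothetical wording).
    prompts: list = field(default_factory=list)


def feedback(item, answer, attempt=0):
    """Confirm a correct answer, or return the prompt for this attempt."""
    if answer == item.correct:
        return "Correct."
    if attempt < len(item.prompts):
        return "Incorrect. Prompt: " + item.prompts[attempt]
    return "Incorrect. No further prompts."


# Hypothetical item modelled on the bacterial reproduction topic.
item = Item(
    question="Which reproduction method starts with a bud forming at the cell tip?",
    correct="budding",
    prompts=["What structure forms at the tip of the cell before a new bacterium is released?"],
)

print(feedback(item, "budding"))
print(feedback(item, "fragmentation", attempt=0))
```

With a single prompt per item, the sketch matches the once-only prompting described above; a longer `prompts` list would model prompting that moves from general to specific.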
After two weeks, the students of both classes worked on the posttest questions. Experimental class students received prompting, as shown in Figures 4 and 5, to help them give the correct answers, while the control class students did not receive prompting. The pretest and posttest used the same items. Pretest results from the experimental class were used to classify students' reasoning skills into high, mediocre, and low categories.

Data Analysis Techniques
Three types of data analysis were applied: a normality test with the Shapiro-Wilk test, a homogeneity test with the Levene test, and hypothesis testing with one-way ANCOVA.
The normality test (Table 2) suggested that the data were normally distributed in both the experimental and the control class. For the pretest, the significance values of the experimental and control classes were greater than 0.05. Likewise, for the posttest, the significance values of both classes were greater than 0.05. A significance value > 0.05 means that H0 is accepted; thus, the reasoning skills data were normally distributed.
Based on Table 3, the homogeneity test showed a significance of 0.0816 > 0.05, so H0 was accepted. Therefore, the pretest scores in the experimental and the control class were homogeneous.
Table 4 shows that the average pretest score of the control class was 31.77, while that of the experimental class was 29.43. The initial ability of both classes was compared using the independent sample t-test. The t-test gave a sig. value of 0.148 > 0.05, which means there was no significant difference in pretest reasoning skills between the control class and the experimental class. According to the data of the experimental and control classes in Table 4, the pretest score, which is the covariate, had a significant correlation with the posttest score after treatment, since the p-value was smaller than α = 0.05. It can be concluded that H0 (dynamic assessment does not affect or increase students' reasoning skills) was rejected and H1 (dynamic assessment affects students' reasoning skills) was accepted.
Thus, dynamic assessment significantly affected high school students' reasoning skills. The hypothesis test was one-way ANCOVA with α = 0.05. Table 5 indicates that the p-value is 0.000 < 0.05; thus, H0 was rejected. This means that, at a confidence level of 95%, dynamic assessment and non-dynamic (static) assessment differed in their influence on reasoning skills posttest scores. It suggests that the posttest scores were due to the implementation of DA rather than to student experience or prior knowledge. Thus, dynamic assessment contributed to the increase in students' reasoning skills.
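The logic of the one-way ANCOVA can be sketched in a few lines, assuming a single covariate (the pretest score) and a common regression slope across classes. The sketch compares a reduced model (one regression line for all students) with a full model (one intercept per class) and returns the F statistic for the class effect. The function names and the toy scores are hypothetical and are not the study's data; converting F to a p-value would additionally require the F distribution from a statistics library.

```python
def _sums(x, y):
    """Return (Sxx, Sxy, Syy): sums of squares and cross-products about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxx, sxy, syy


def ancova_f(groups):
    """F statistic for the group effect in a one-way ANCOVA with one covariate.

    groups: list of (pretest_scores, posttest_scores) pairs, one per class.
    Returns (F, df_between, df_error).
    """
    k = len(groups)
    n_total = sum(len(pre) for pre, _ in groups)

    # Full model: common slope, separate intercepts (pooled within-group sums).
    wxx = wxy = wyy = 0.0
    for pre, post in groups:
        sxx, sxy, syy = _sums(pre, post)
        wxx += sxx
        wxy += sxy
        wyy += syy
    sse_full = wyy - wxy ** 2 / wxx

    # Reduced model: a single regression line ignoring class membership.
    all_pre = [v for pre, _ in groups for v in pre]
    all_post = [v for _, post in groups for v in post]
    txx, txy, tyy = _sums(all_pre, all_post)
    sse_reduced = tyy - txy ** 2 / txx

    df_between = k - 1
    df_error = n_total - k - 1
    f_stat = ((sse_reduced - sse_full) / df_between) / (sse_full / df_error)
    return f_stat, df_between, df_error


# Hypothetical toy scores: (pretest, posttest) for a control and an experimental class.
control = ([1, 2, 3, 4], [2, 3, 5, 4])
experiment = ([1, 2, 3, 4], [12, 13, 15, 14])
print(ancova_f([control, experiment]))
```

A large F relative to its degrees of freedom indicates that the adjusted posttest means differ between classes after accounting for the pretest, which is the comparison the hypothesis test relies on.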

RESULTS AND DISCUSSION
This research showed that the dynamic assessment intervention helps students to answer questions about the concepts of bacteria that involve correlational and combinatorial reasoning indicators. DA provides the opportunity to intervene and lead students to the correct answer. In this study, the interventions were given as several prompts for each question. The first prompt provides little information about the solution to the question. The subsequent prompts bring more information until, finally, students can answer correctly. Prompts were thus given gradually, from the general to the more specific. This helps and guides students to reach the correct answer (Wang, 2010).
Dynamic assessment also applies Vygotsky's theory of the Zone of Proximal Development (ZPD). It assumes that students can be helped to achieve more when they are provided with assistance. In addition, prompting in dynamic assessment helps students develop their conceptual understanding and answer the questions correctly (Khaghaninejad, 2015; Poehner & Infante, 2016).
The following are examples of questions, accompanied by a discussion of concept indicators and the prompting given to the experimental class:
Q4: Bacteria differ from viruses, protists, and Animalia. Based on the level of the organism, which one of the following statements correctly describes the level of the organism?
Q8e: Bacteria have many unique ways to reproduce. There are several ways of bacterial reproduction, such as binary division, budding, fragmentation, and DNA recombination (transformation, conjugation, and transduction). In one of the bacterial propagation processes, the tip of the bacterial cell develops into a bud, followed by genomic replication in the bud. The buds develop and release new bacteria. What is the reproduction method described in these statements?
For Q4, six students from the low achievement (LA) group of the experimental class answered correctly after getting prompts. Before the DA treatment, only one of those six gave the correct answer. For Q7a, six students from the LA group of the experimental class answered correctly after getting prompts. Previously, only five of those six gave the correct answer. Moreover, six students from the LA group of the experimental class answered correctly after getting prompts for Q8e. Before the DA treatment, none gave the correct answer.
Changes due to prompting appeared in students who gave an incorrect answer in the pretest and then gave the correct answer at the posttest stage. Some students who had given a correct answer during the pretest still gave the correct answer after prompts were given. Some students were unable to give the correct answer even after prompts were provided. Prompting for Q4 helped the students to develop a conceptual understanding of how to distinguish between bacteria, viruses, and protists based on their structure. Prompting for Q7a helped the students to distinguish between heterotrophic and autotrophic bacteria. Prompting for Q8e helped students to differentiate the bacterial reproduction methods. Students could give the correct answers because the prompts helped them to remember previous concepts and reconstruct their understanding.
An example of the changes that occurred in one student from the low achievement group of the experimental class is Student Number 10 (S10). S10 scored 19 on the pretest. After prompting, S10 obtained a higher score on the posttest (48). S10 initially gave wrong answers to some items during the pretest (such as Q3g, Q3h, Q4, Q5c, Q5f, Q6a, Q6d, Q6e, Q7e, Q7h, Q8a, Q8b, Q8e, Q8f, Q8g, Q9, Q10, Q12b, Q12c, Q12d, Q12e, Q12f, Q12g, Q13a, Q13c, Q13d, Q13e, Q13i, and Q13k). Prompting helped S10 to give the correct answers during the posttest. These changes in students' answers showed that dynamic assessment with prompting is effective in helping students improve their achievement. An effective prompt should be matched to the ZPD, meaning that the prompt can guide students to reach their optimal potential abilities (Navarro & Mourgues, 2018).
The control class was not provided with the dynamic assessment and was only asked to study the topic of bacteria independently. In the absence of scaffolding or assistance, the students in the control class did not receive feedback to bring them to the correct answer, and they had lower posttest scores than the experimental class. Feedback is an important part of assessment for learning (AFL), especially for students who need scaffolding or assistance to understand the concepts comprehensively (Eremina & Reginald, 2016). Dynamic assessment has been advocated as an assessment that can differentiate students' proficiency (Feng & Heffernan, 2010).
The results showed a difference in reasoning skills between students provided with the dynamic assessment and those without it. The experimental class had a higher reasoning skills score, which differed significantly from that of the control class. The reasoning skills of students also improved through DA, since the prompting in DA acts as feedback for students in the sense of assessment for learning. By controlling for students' scientific reasoning abilities in an experimental design, van der Graaf and team also found that assistance was effective in improving students' skills (van der Graaf, Segers, & Verhoeven, 2015). This is also in line with Vygotsky's theory of the zone of proximal development (Clarà, 2017; Poehner, 2017).

CONCLUSION
The research revealed that dynamic assessment with graduated prompting affects students' reasoning skills and also their understanding of the concepts of bacteria. Thus, it is promising to carry out the same research at other levels of schooling, on other biology topics, and on other types of thinking skills. However, the dynamic assessment instrument, which was developed on the Google Forms quiz platform, has some limitations: it is not flexible in providing varied prompting as assistance for students, and students can only access the prompts after completing all of the questions. Therefore, a more appropriate platform should be developed or compared in future research, and a large-scale test is also advised to confirm the effectiveness of the graduated prompting DA instrument.
Van der Graaf, J., Segers, E., & Verhoeven, L. (2015). Scientific reasoning abilities in kindergarten: dynamic assessment of the control of variables strategy. Instructional