Comparison of bird diet at two different nests

I have a list with the different prey types and quantities that birds at Nest $1$ gave to their fledglings. I also have the same information for another nest of the same bird species.

Suppose I have

NEST 1
Mammals: 500
Arthropods: 200
Birds: 20

NEST 2
Mammals: 180
Arthropods: 50
Birds: 3

Can I use the Chi-squared test to demonstrate the diets are actually very similar (or maybe not similar)?

Of course, my actual data is more complex, with more classes (to family level).


Solution 1:

Your idea to do a chi-squared test of homogeneity seems correct. I put the fictitious data of your question into a contingency table in R (using 100, rather than your 180, as the mammal count for Nest 2):

nest.1 = c(500, 200, 20)
nest.2 = c(100, 50, 3)
TAB = rbind(nest.1, nest.2)
TAB
       [,1] [,2] [,3]
nest.1  500  200   20
nest.2  100   50    3

Then R carries out the chi-squared test, as shown below; it is also not difficult to do on a calculator.

chisq.test(TAB)

        Pearson's Chi-squared test

data:  TAB
X-squared = 1.6849, df = 2, p-value = 0.4307

Warning message:
In chisq.test(TAB) : 
 Chi-squared approximation may be incorrect

As you may know, the chi-squared test statistic is computed using the observed counts in TAB along with expected counts computed using the marginal totals of TAB according to the null hypothesis that the two nests have the same proportions of food categories. If any of the expected counts are smaller than $5$ the chi-squared statistic may not have the intended chi-squared distribution, and thus the P-value may be misleading.
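
If you want to see that arithmetic explicitly, the expected counts and the statistic can be rebuilt from the margins of TAB. A minimal sketch (the object names EXP and X2 are just mine, not anything chisq.test creates):

# Expected count = (row total x column total) / grand total
EXP = outer(rowSums(TAB), colSums(TAB)) / sum(TAB)
# Pearson statistic: about 1.68 for this table
X2 = sum((TAB - EXP)^2 / EXP)
# P-value from the chi-squared distribution with (r-1)(c-1) = 2 degrees of freedom: about 0.43
pchisq(X2, df = (nrow(TAB) - 1) * (ncol(TAB) - 1), lower.tail = FALSE)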

For your data the expected counts can be displayed as follows:

chisq.test(TAB)$exp
           [,1]      [,2]      [,3]
nest.1 494.8454 206.18557 18.969072
nest.2 105.1546  43.81443  4.030928
Warning message:
In chisq.test(TAB) : 
 Chi-squared approximation may be incorrect

Because only one of the expected counts is below $5,$ and it is above $3,$ the P-value $0.43 > 0.05 = 5\%$ is likely OK, indicating no evidence that the two nests differ as to relative frequencies of food categories. (Many statisticians would trust the P-value as long as only a few of the expected counts are below $5$ and none are below $3$.)
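
If you want R to count the small cells for you, a quick check along these lines works:

# How many expected counts fall below 5?  (Here only the 4.03 for birds at nest 2.)
sum(chisq.test(TAB)$expected < 5)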

However, as implemented in R, chisq.test can simulate a more accurate P-value whenever such a warning appears. If you need to do this with your real data, the procedure is as below. (I would run the simulation any time there is a warning message, just to be sure; the simulation method in R is carefully implemented and I have found it to be trustworthy.)

chisq.test(TAB, sim=T)

        Pearson's Chi-squared test 
        with simulated p-value 
        (based on 2000 replicates)

data:  TAB
X-squared = 1.6849, df = NA, p-value = 0.4233

This may be necessary if you have more food categories (some of which may have low counts).
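
If the simulated P-value matters for your real data, you can also ask for more than the default 2000 replicates via the B argument; the 10000 below is just an illustrative choice.

chisq.test(TAB, simulate.p.value = TRUE, B = 10000)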

I am not saying that a chi-squared test of homogeneity is the only way to analyze your data, but it is a very commonly used method and will be understood by most people reading an account of your study.

Note: As such questions go, I think your project is very well explained (+1) and the best method of analysis is pretty obvious. However, in the future, if you have questions that are more about statistical practice than about mathematical statistics, I would suggest posting on stats.stackexchange.com, where experienced moderators and knowledgeable applied statisticians regularly comment on the validity of answers.

Solution 2:

The test you might want is the chi-squared test for the rows of a contingency table to have the same distribution. In your case, the contingency table is $$ \begin{array}{c|c|c|c|c} &\text{mammals}&\text{arthropods}&\text{birds}&\text{total}\\ \hline \text{nest }1&500&200&20&720\\ \hline \text{nest }2&180&50&3&233\\ \hline \text{total}&680&250&23&953 \end{array} $$ The test is based on the assumption that the distribution of the so-called contingency chi-squared statistic is approximately a chi-squared distribution with $\ (r-1)(c-1)\ $ degrees of freedom, where $\ r\ $ is the number of rows and $\ c\ $ the number of columns in your contingency table (excluding those containing the row and column totals). This will be the case if the numbers in each row of the table are multinomially distributed, and the probability of an item going into column $\ j\ $ has the same value $\ q_j\ $ for all rows (which is your null hypothesis), where $\ q_j>0\ $ and $\ \sum_j q_j=1\ $.
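
For the $2\times3$ table above, for example, that comes to $$ (r-1)(c-1)=(2-1)(3-1)=2 $$ degrees of freedom.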

If the entry in row $\ i\ $ and column $\ j\ $ is $\ N_{ij}\ $, the column sum $\ \sum_k N_{kj}\ $ of column $\ j\ $ divided by the total $\ N_\text{tot}=\sum_k\sum_l N_{kl}\ $ of all entries is used as an estimate $$ \hat{q}_j=\frac{\sum_k N_{kj}}{N_\text{tot}} $$ of $\ q_j\ $. If the null hypothesis is satisfied, the number of items you would expect to appear in the cell in row $\ i\ $ and column $\ j\ $ is $\ \hat{q}_j\ $ times the total number, $\ \sum_l N_{il}\ $, of items in row $\ i\ $: $$ E_{ij}=\hat{q}_j\sum_l N_{il}=\frac{\sum_l N_{il}\sum_k N_{kj}}{N_\text{tot}}\ , $$ and the chi-squared statistic is $$ \chi^2=\sum_i\sum_j\frac{\big(N_{ij}-E_{ij}\big)^2}{E_{ij}}\ . $$
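
To make the formulas concrete with the table above (my arithmetic, which is worth re-checking): the smallest expected count is the one for birds at nest $2$, $$ E_{23}=\frac{233\times 23}{953}\approx 5.62\ , $$ and summing the six terms $\ \big(N_{ij}-E_{ij}\big)^2/E_{ij}\ $ gives $\ \chi^2\approx 5.80\ $, in agreement with the calculator value quoted below.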

If the null hypothesis is satisfied, then you expect the chi-squared statistic to have a value reasonably close to the mean of the appropriate chi-squared distribution, so you calculate the probability that a variable with that distribution would have a value larger than the one you got from your contingency table. This is commonly called the "$p\ $ value". If it's very small, that's taken as a disconfirmation of the null hypothesis (i.e. as an indication that it's unlikely to be satisfied). The value of $\ p\ $ below which you're going to consider the null hypothesis as disconfirmed is called the "significance level". What you choose it to be is up to you, though you need to keep in mind that the larger you make it, the more likely it is that you will be mistaken when you take the null hypothesis as being disconfirmed. The most common significance level used in biological, medical and social sciences appears to be $\ 0.05\ $.
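
For example, with $\ 2\ $ degrees of freedom the cut-off for significance at the $\ 0.05\ $ level can be looked up directly (in R, say, though any chi-squared table gives the same number):

# Critical value of the chi-squared distribution with 2 df at the 0.05 level: about 5.99
qchisq(0.95, df = 2)

So the statistic has to exceed roughly $\ 5.99\ $ before $\ p\ $ drops below $\ 0.05\ $.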

For the contingency chi-squared test to be accurate, the expected count in each cell of your contingency table must be sufficiently large. A typical recommendation is that the expected count in each cell should be at least $5$, though I have certainly seen authorities who maintain that this is way too conservative. For the data you've given, however, the expected counts do all exceed $5$, so there shouldn't be any problem applying the contingency chi-squared test to them. Plugging your numbers into an online contingency chi-squared calculator gives the value of the chi-squared statistic as $\ 5.803211596886\ $ and a $\ p$-value of $\ 0.054934934758\ $. This isn't significant at the $\ 0.05\ $ level, so those numbers don't provide much evidence of any dissimilarity between the two diets.

You also need to keep in mind that what you're really testing is the hypothesis that the chi-squared statistic follows the chi-squared distribution you've assumed it will when your null hypothesis is true, and that hypothesis might fail for reasons other than that the null hypothesis is false. It's conceivable, for instance, that the distributions of the numbers in the two rows are the same, but that the numbers themselves are correlated across the rows, and if that's the case, then the chi-squared distribution might not be a good approximation for that of the chi-squared statistic, even though your null hypothesis is true.

Answers to OP's questions in comments

I wouldn't attach any importance to the $\ p$-value by itself, unless it's very small. Ideally, you should decide on the significance level $\ \sigma\ $ you're going to use for a null-hypothesis significance test before you gather any data, and then only pay attention to whether $\ p<\sigma\ $ or $\ p\ge\sigma\ $. The jargon for reporting these results is typically "the null hypothesis can be rejected at the $\ \sigma\ $ level of significance" for the first, or "the null hypothesis cannot be rejected at the $\ \sigma\ $ level of significance" for the second.

If your null hypothesis is correct (as well as any subsidiary statistical assumptions needed for your test statistic to have its presumed distribution), then the significance level is the probability that, just by chance, you will erroneously reject it (i.e. make a so-called type I error). Thus, while I do think $\ p<\sigma\ $ provides some evidence that your null hypothesis is false, nevertheless, if $\ \sigma=0.05\ $ and $\ p=0.043\ $, say, I wouldn't regard that evidence as being particularly strong. At best, I'd regard a rejection of the null hypothesis as being very provisional, pending further investigation. Also, if you're going to be carrying out lots of tests, you can expect about $\ \sigma T\ $ of them to have results that are significant at the $\ \sigma\ $ level, where $\ T\ $ is the total number of tests you do, so you should choose $\ \sigma\ $ to make $\ \sigma T\ll1\ $.
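
To put a number on that: with $\ T=20\ $ tests at $\ \sigma=0.05\ $ you'd expect about $\ \sigma T=1\ $ spuriously significant result even if every null hypothesis were true, whereas taking $\ \sigma=0.0025\ $ would bring $\ \sigma T\ $ down to $\ 0.05\ $.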

One of the big problems with null hypothesis significance tests is that if your null hypothesis is false, you have no idea of what the distribution of your test statistic is likely to be, and hence what the probability of your failing to reject the null hypothesis is (i.e. what the probability of making a type II error is) when it is false. Since I regard $\ p<\sigma\ $ as evidence against the null hypothesis when $\ \sigma\ $ is properly chosen, and the statistical assumptions needed to derive the distribution of the test statistic seem reasonable, then, as a matter of principle, I must also regard $\ p\ge\sigma\ $ as evidence in favour of it. Nevertheless, since I really have no idea of what the strength of that evidence might be, I would tend to disregard it completely. In your case, therefore, even if $\ p\ $ were as high as $0.45$, I'd limit myself to saying that the null hypothesis can't be rejected at the level of $\ \sigma\ $ (whatever $\ \sigma\ $ happens to be), and certainly wouldn't commit myself to saying that the diets are very similar, let alone the same, on the basis of this test alone.

There are probably statistical tests you could do that would give you a better chance of reaching firmer conclusions, but I wouldn't consider myself a fit and proper person to advise you on what they might be. If you want more authoritative opinions, I'd follow BruceET's advice to take your enquiry to stats.stackexchange.com.