
(Shared) How to Calculate Sample Size

Published: 2025-01-03 17:20

I posted earlier about a sports nutrition website that my supervisor has bookmarked. It is a personal site, but its statistics section is remarkably detailed and rich. Not many people seemed to take notice, which is a pity, because the site really is excellent. Here is a screenshot of the site: click the "statistics" link marked in the image and you will find a wealth of statistical material. I often see forum members asking how to determine sample size. The required sample size depends on your study design, so I am pasting the site's material on calculating sample size below for reference. Please choose the calculation that matches your own design. It is all in English, but that should not be a problem for anyone here.

WHAT DETERMINES SAMPLE SIZE?

The traditional approach to estimation of sample size is based on statistical significance of your outcome measure. You have to specify the smallest effect you want to detect, the Type I and Type II error rates, and the design of the study. I present here new formulae for the resulting estimates of sample size. I also include new ways to adjust for validity and reliability, and I finish with sample sizes required for several complex cross-sectional designs.

I also advocate a new approach to sample-size estimation based on the width of the confidence interval of your outcome measure. In this new approach, your concern is with the precision of your estimate of the effect, not with the statistical significance of the effect. The formulae on these pages still apply, but you halve the sample sizes.

--------------------------------------------------------------------------------

The Smallest Effect Worth Detecting

I've already spent a whole page on magnitudes of effects. You should go back and make sure you understand it before proceeding. Or take a risk and read on!

Let's look at a simple example of the smallest effect worth detecting. Your research project includes the question of differences in height of adults in two regions. This sounds like a trivial project, but hey, the difference might be caused by a nutritional deficit, an environmental toxin, the level of physical activity, or whatever. OK, what difference in height would you consider to be the smallest difference worth noticing or commenting on? Almost everyone reading this paragraph will automatically start thinking either in inches or centimeters. So what's your choice? An inch, or 2.5 cm? Sounds like a nice round figure! Let's go with it for now.

To use my approach to sample-size estimation, you convert this difference into a value for the effect-size statistic. To do that, you divide it by the standard deviation, expressed in the same units. The standard deviation here is just the usual measure of spread, except that we have two groups. So let's assume we have an average of the standard deviation in both groups. Let's say it is 2 inches, or 5 cm. So, if you want to detect 2.5 cm, and the standard deviation is 5.0 cm, the smallest effect worth detecting is 2.5/5.0, or 0.5.

I'll talk about what I mean by detecting in a minute. First, more about the smallest effect. You'll discover shortly that the required number of subjects is quite sensitive to the magnitude of the smallest worthwhile effect. In fact, halving the magnitude quadruples the number of subjects required to detect it. So the way you decide on the smallest effect is important. How did we arrive at that minimum difference of 2.5 cm? In my experience, most researchers dream up a number that sounds plausible, just like we did here. Well, sorry, but you just can't do it like that. In fact, you don't have the freedom to choose the minimum effect. In all but a few special cases, it's the threshold for small effects on the scale of magnitudes: 0.2 for the Cohen effect-size statistic, 10% for a frequency difference, and 0.1 for a correlation.
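Here is a minimal numerical sketch of the effect-size calculation just described, using only the illustrative height example from the text; the function name is my own, not the site's.

```python
# Effect size as used above: the smallest worthwhile difference between group
# means divided by the between-subject standard deviation (the height example).
def effect_size(difference_between_means, between_subject_sd):
    return difference_between_means / between_subject_sd

es = effect_size(2.5, 5.0)   # 2.5 cm difference, 5.0 cm standard deviation
print(es)                    # 0.5

# Required sample size scales as 1/ES**2, so halving the smallest worthwhile
# effect quadruples the number of subjects needed to detect it.
print((1 / 0.25 ** 2) / (1 / 0.5 ** 2))   # 4.0
```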
You need the same sample size to detect each of these threshold effects, and as we'll see, it's 800 subjects for a simple cross-sectional study in the old-fashioned way of doing the figuring. It's even more than 800 when you factor in the validity of your variables. But don't panic. We'll also see that there are ways of reducing this number, sometimes drastically.

--------------------------------------------------------------------------------

Type I and II Error Rates

Now, what do I mean by detecting? Simply that if the real difference between the two groups in the population is 2.5 cm (an effect size of 0.5), you want to be sure that it will turn up as statistically significant in the sample that you draw for your study. If it doesn't turn up as statistically significant, you have failed to detect something that you were interested in. Make sense? So our definition of statistical significance, and our idea of what it means to be sure that it will turn up, both affect the required sample size.

First, statistical significance. The difference is statistically significant, by definition, if the 95% confidence interval does not overlap zero, or if the p value for the effect is less than 0.05. Values of 95% or 0.05 are equivalent to a Type I error rate of 5%: in other words, the rate of false alarms in the absence of any population effect will be 5%. We don't have any choice here. It has to be 5% (or preferably less), but most researchers opt for 5%. If you want a lower rate of false alarms, say 1%, you will need more subjects.

Now, what about being sure that the effect will turn up? In other words, if the effect really is 2.5 cm in the populations, how sure do we want to be that the difference observed in our sample will be statistically significant? We don't have any choice here, either. We have to be at least 80% sure of detecting the smallest effect. To put it another way, the power of the study to detect the smallest effect has to be at least 80%. Or to put it yet one more way, the Type II error rate--the rate of failed alarms for the smallest effect--is set at 20% or less. That's one chance in five of missing the thing you're looking for! Sounds a bit high, but keep in mind that it is the rate for the smallest worthwhile effect. The chance of missing larger effects is smaller. Once again, if you want to make the error rate lower, say 10%, you will need more subjects.

--------------------------------------------------------------------------------

Research Design

We're stuck with having to detect 0.2 for the effect-size statistic, 10% for a frequency difference, or 0.1 for a correlation. And we're stuck with false and failed alarms of 5% and 20%. All that's left now is how we're going to go about it: the research design.

When it comes to sample sizes, there are only two sorts of research design: cross-sectional and longitudinal. Cross-sectional designs include correlational, case-control, and any other design with single observations for each subject. Some so-called prospective designs, where subjects are followed up over time, are cross-sectional if there is only one value for each variable for each subject. Cross-sectional studies need heaps of subjects, and the number is affected by the validity of the variables.

Longitudinal designs include time series, experiments, controlled trials, crossovers, and anything else where the dependent variable is measured twice or more. The data have to be subjected to repeated-measures analysis.
The usual thing with these designs is a measurement before and after you do something, to see if what you do has any effect. Whether or not you have a control group, it is always the case that subjects "act as their own controls", because there are always pre and post measurements on the subjects. Longitudinal designs generally need far fewer subjects than cross-sectional designs, depending on the reliability of the dependent variable.

Sample Size for Cross-Sectional Studies

For variables with perfect validity, you can now look up tables or run special software to see how many subjects you need. (G*Power is a great little free program for the purpose.) Or use the following simple formula I have worked out. For Type I and II errors of 5% and 20%, the total number of subjects N is given by:

N = 32/ES², where ES is the smallest effect size worth detecting.

Example: for ES = 0.2, the total N is 800, which means 400 in each group for a case-control study or a study comparing males and females. So for our study of differences in height, we'd need 400 in each group.

What about if the outcome is a difference in the frequency of something in the two groups, for example the frequency of clinical obesity? The minimum worthwhile difference is 10% (e.g. 25% in one group and 35% in the other). You just think about that difference as being equivalent to an effect size of 0.2, and plug it into the formula: 400 in each group again.

And finally, what about the sample size to detect a correlation, for example the correlation between physical activity and body fat? Same story: 800 subjects to detect the minimum worthwhile correlation of 0.1, because a correlation of 0.1 is equivalent to an effect size of 0.2. For larger correlations, use the scale of magnitudes to convert the correlation to an equivalent effect size, then plug it into the formula. For the rare cases where you have the luxury of Type I and II errors of 1% and 10% respectively, the number is nearly double: N = 60/ES².

Validity of the variables can have a major impact on sample size in cross-sectional studies. The lower the validity, the more "noise in the signal", so the more subjects you need to detect the signal. If the validity correlation of the dependent variable is v (Pearson, intraclass, or kappa), the number of subjects increases to N/v². To detect a correlation between variables with validities v and w, the number is N/(v²w²). Sample sizes may therefore have to be doubled or quadrupled when effects are represented by psychometric or other variables that have modest (~0.7) validity.

Sample Size for Longitudinal Studies

In our first example on this page, we had a cross-sectional design in which we were interested in the difference in height between people in two regions. Now, in a longitudinal design, we might want to know whether a stretching exercise makes people taller. Can you see that the same concept of minimum effect size still holds here? If we thought one inch was the smallest difference worth detecting between groups, then it has to be the smallest difference we would like to see as a result of our stretching exercise. (It might need a medieval rack to make people a whole inch taller!) Once again, we don't have a choice about that minimum effect: it's still an effect size of 0.2 standard deviations, and the standard deviation is still the usual standard deviation of the subjects.
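Before continuing with the longitudinal case, here is a minimal sketch of the cross-sectional formulae above. The function names are my own; the constants 32 and 60 are reproduced from standard normal quantiles purely as a cross-check of the rounded rules quoted in the text.

```python
from statistics import NormalDist

def total_n_cross_sectional(es, alpha=0.05, beta=0.20):
    # Total N across both groups; reproduces the rounded rules above:
    # ~32/ES^2 for 5%/20% error rates, ~60/ES^2 for 1%/10% error rates.
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(1 - beta)) ** 2 / es ** 2

def adjust_for_validity(n, v, w=1.0):
    # Imperfect validity inflates N: N/v^2, or N/(v^2 * w^2) for a correlation
    # between two variables with validities v and w.
    return n / (v ** 2 * w ** 2)

print(round(total_n_cross_sectional(0.2)))              # ~785, i.e. the quoted 800 (400 per group)
print(round(total_n_cross_sectional(0.2, 0.01, 0.10)))  # ~1488, i.e. nearly double
print(round(adjust_for_validity(800, v=0.7)))           # ~1633: modest validity roughly doubles N
```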
Back to the longitudinal design. At the moment we have only one group of subjects, and the standard deviation before we put people on the rack is usually about the same as after the rack. So you can think about the minimum effect size as a fraction of either standard deviation. But note well: do not use the standard deviation of the before-after difference score.

Reliability of the dependent variable is the final piece of the jigsaw. The higher the reliability, the more reproducible are the values for each subject when you retest them, which makes it more likely you will detect a change in their values. So the higher the reliability, the fewer subjects you need to detect the minimum effect. Read the earlier section on sample size for an experiment for an overview of the role of typical error in sample-size estimation, and for an important detail about the conditions in a reliability study aimed at estimating sample size. The rest of this section contains details of formulae that you may not need to worry about. You can use two forms of reliability in the formulae: the retest correlation and the within-subject variation.

Using the Retest Correlation

First, a couple of cautions. The retest correlation is for retests with the same time between the tests as you intend to have in your experiment. For example, if you are doing an intervention that lasts 2 months, you need a 2-month retest correlation. Don't use a 1-day retest correlation unless you have good grounds for believing that it will be the same as a 2-month retest correlation. Also, the spread between the subjects in your study has to be similar to the spread between the subjects in the reliability study. If the spread is different, the value of the retest correlation coefficient will be inappropriate. In that case you will need to calculate the appropriate value by combining the within-subject (s) and between-subject (S) standard deviations for your subjects using this formula: retest correlation r = (S² - s²)/S².

Right, here's the strategy for working out the required sample size when you know the retest correlation:

Work out the sample size of an equivalent cross-sectional study, N, as shown above. It's 800 in the traditional approach using statistical significance, or 400 using my new approach of adequate precision of estimation for trivial effects. Determine the reliability r of the outcome measure by consulting the literature or doing a separate study.

For a simple design consisting of a single pre and post measurement on each subject, and no control group, the number of subjects is: n = (1 - r)N/2. This formula also applies to simple crossover designs, in which subjects receive an experimental treatment and a control treatment. (One half get the experimental treatment first; the other half get the control treatment first.)

If there is a control group, the total number of subjects required is: n = 2(1 - r)N. Yes, you need four times the number of subjects when there is a control group, not twice the number. Hard to accept, I know.

To take into account the validity of the outcome measure, multiply the above formulae by 1/v², where v is the concurrent validity correlation (the correlation between the observed value and the true value of the variable). The simplest estimate of the concurrent validity is the square root of the concurrent reliability correlation for the outcome measure, so you simply divide the above formulae by the concurrent reliability correlation.
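A minimal sketch of those retest-correlation formulae follows; the function names are my own, and N is the equivalent cross-sectional sample size (800 in the traditional approach, 400 with the precision-of-estimation approach).

```python
def n_longitudinal(retest_r, cross_sectional_n=800, control_group=False):
    # From the formulae above:
    # n = (1 - r) * N / 2 without a control group (or for a simple crossover),
    # n = 2 * (1 - r) * N in total with a full control group.
    if control_group:
        return 2 * (1 - retest_r) * cross_sectional_n
    return (1 - retest_r) * cross_sectional_n / 2

def adjust_for_outcome_validity(n, concurrent_reliability):
    # Divide by the concurrent reliability correlation, i.e. multiply by 1/v^2
    # with v = sqrt(concurrent reliability), as described above.
    return n / concurrent_reliability

print(n_longitudinal(0.9))                           # 40.0 subjects, no control group
print(n_longitudinal(0.9, control_group=True))       # 160.0 subjects in total with a control group
print(round(adjust_for_outcome_validity(40, 0.95)))  # ~42 after the validity adjustment
```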
(In general, the concurrent reliability will be greater than the retest reliability.)

Using the Within-Subject Variation

You can also think about the difference between the post and pre means in terms of the within-subject variation (standard deviation). For example, if the performance of an individual athlete varies by 1% (the within-subject standard deviation expressed as a coefficient of variation), how many athletes should you test to detect a 1% change in performance, or a 2% change, or a 0.5% change? Here are the formulae. To detect a fraction f of a within-subject standard deviation, with 5% false alarms and 20% failed alarms:

n = 64/f² with a full control group
n = 16/f² for crossovers or experiments without a control group.

Another way to represent the same formulae is to replace f with d/s, where d is the smallest worthwhile post-pre difference you want to detect, and s is the within-subject standard deviation:

n = 64s²/d² with a full control group
n = 16s²/d² for crossovers or experiments without a control group.

Remember to halve these numbers when you justify sample size using the new approach based on acceptable precision of the outcome.

Example: You want to detect (p=0.05, 80% power) a 2% change in performance when the coefficient of variation is 2%. The corresponding value of f is 1.0, which means you'd need to test 16 athletes in a crossover design, or 32 in each of a control and an experimental group. Or it's 8, or 16+16, if you justify sample size using precision of estimation.

What's the smallest value of f worth detecting? Is it 1.0? Not an easy question! To answer it, you usually have to bring in the between-subject variation one way or another. Why? Because you can't get away from the fact that the magnitude of a change in the value of a variable usually has to be thought about in terms of the variation in the values of that variable between subjects. That's what minimum worthwhile effect sizes are all about. For example, if the between-subject variation is 5%, the smallest difference worth detecting is 0.2 × 5%, or 1%. So, if your within-subject variation is 2%, you have to chase an f of 0.5. But if the between-subject variation is 10%, the smallest worthwhile effect is 0.2 × 10%, or 2%, so you chase an f of 1.0.

Once you bring the between-subject variation back into the picture, you have all the ingredients for expressing the reliability as a retest correlation, so you can use the formulae with the retest correlation. For example, a within of 2% and a between of 5% implies a retest correlation of (5² - 2²)/5² = (25 - 4)/25 = 0.84. A within of 2% and a between of 10% implies a correlation of (100 - 4)/100, or 0.96. Use these correlations in the formulae for sample size and you'll get the same answers as with the formulae using f. But if you have a reasonable notion of the smallest worthwhile change in a variable without explicitly knowing the between-subject standard deviation or the correlation, use the formula with d and s (or f).

There is certainly one situation where it's better to use the within-subject variation: estimation of sample size in studies of athletic performance. When athletes are subjects and competitive performance is the outcome, the smallest worthwhile effect is an enhancement that increases the medal prospects of a top athlete, not the average athlete. For sports like track and field, this minimum effect is about 0.5 of the typical variation in a top athlete's performance between events.
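Before the athlete example continues, here is a minimal sketch of the within-subject-variation formulae and the conversion to a retest correlation; the names are my own, and the numbers are the worked examples from the text.

```python
def n_from_within_subject_sd(d, s, control_group=True):
    # Subjects needed to detect a post-pre change d when the within-subject SD
    # (typical error) is s: n = 64*(s/d)**2 with a full control group,
    # n = 16*(s/d)**2 for a crossover or an uncontrolled experiment.
    f = d / s
    return (64 if control_group else 16) / f ** 2

def retest_correlation(within_sd, between_sd):
    # r = (S^2 - s^2) / S^2, combining within- (s) and between-subject (S) SDs.
    return (between_sd ** 2 - within_sd ** 2) / between_sd ** 2

# The 2% change with a 2% coefficient of variation (f = 1.0) from the text:
print(n_from_within_subject_sd(d=2.0, s=2.0, control_group=False))  # 16 (crossover)
print(n_from_within_subject_sd(d=2.0, s=2.0, control_group=True))   # 64 in total, i.e. 32 + 32

# A within of 2% and a between of 5% or 10% gives the correlations quoted above:
print(retest_correlation(2, 5))    # 0.84
print(retest_correlation(2, 10))   # 0.96
```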
To continue the athlete example: if the typical variation between events is 1.0%, then you're interested in enhancements of about 0.5%. So if you use a lab test with the same typical error as the competitive event, f in the above formulae is simply 0.5, so you would need 64/0.5², or 256, subjects for a fully controlled study. That's bad enough, but if your lab test has a typical variation of 2.0%, f is 0.5/2.0 = 0.25, which means 1024 subjects! Oh no! Clearly you need very reliable lab tests if you want to detect the smallest effects that matter to top athletes. See this Sportscience article for more information:

Hopkins WG, Hawley JA, Burke LM (1999). Researching worthwhile performance enhancements. Sportscience 3, sportsci.org/jour/9901/wghnews.html

Sample Size for Complex Cross-Sectional Studies

I'll deal with two groups of unequal size, more than two groups, and more than one independent variable. Anything else requires simulation.

Two Groups of Unequal Size

Up to this point I have assumed equal numbers in each group, because that gives the most power to detect a difference between the groups. But sometimes unequal numbers are justified. The simplest case is where you have far more in one group than another. For example, you already have the heights of thousands of control subjects from all over the country, and you want to compare these with the heights of people from a particular region you are interested in. So, how many subjects do you need in that particular group? And the answer is... as few as one-quarter the usual number! But you will need to test, or have the data for, an "infinite" number of subjects in the other group for the number to be that low. How big is infinite? For the purposes of statistical power, about 5 times as many as in the special-interest group is close enough.

I have a formula, but understanding how to apply it will need a lot of thought. If you have samples of size n1 and n2, then your study will have power equivalent to a study with a sample size of N equally divided between two groups, where:

N = 4n1n2/(n1 + n2)

For example, if you have data for 1000 controls (= n1), and 800 (= N) is the number you would normally require for equal-sized groups, then the above formula shows that you need to test only 250 cases (= n2). If you make n1 very large, the formula simplifies to N = 4n2, or n2 = N/4, which is one-quarter the usual total number.

More Than Two Groups

Suppose we wanted to compare the heights of people in more than two regions. What should we do about the sample size? Do we need more than 400 in each region, fewer than 400, or just 400? And the answer is... it depends on what estimates or contrasts you want to perform.

If you are interested in comparing one particular region with another particular region, you will still need 400 in each of those regions to keep the same power to detect a difference. The fact that you have all those other regions in the analysis matters not a jot, I'm afraid. They don't increase the power of the design unless the number in each region is about 10 or less, which it never should be!

If you are interested in comparing one particular region with the mean of all the others, you've got the usual two-group design, but with 400 subjects in the region of interest and 400 divided up equally among the other regions.

If you want to do every possible comparison between pairs of regions, or between pairs of groups of regions, things start to get complicated. As far as I can see, with six regions, say, only five completely independent comparisons are possible.
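Stepping back to the unequal-groups formula for a moment, here is a minimal sketch; the function names are my own, and the numbers are the 1000-controls example from the text.

```python
def equivalent_equal_group_n(n1, n2):
    # Power of unequal groups n1, n2 equals that of N subjects split equally,
    # with N = 4*n1*n2/(n1 + n2), the formula above.
    return 4 * n1 * n2 / (n1 + n2)

def n2_required(n1, target_total_n):
    # Solve N = 4*n1*n2/(n1 + n2) for n2: how many special-interest subjects
    # are needed, given n1 readily available controls.
    return target_total_n * n1 / (4 * n1 - target_total_n)

print(equivalent_equal_group_n(400, 400))          # 800.0: equal groups, the usual case
print(n2_required(n1=1000, target_total_n=800))    # 250.0: the example in the text
print(n2_required(n1=10**6, target_total_n=800))   # ~200: approaches N/4 as n1 grows
```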
Returning to those five independent comparisons: if you are concerned about inflation of the Type I error, you will need to apply Bonferroni's correction by reducing the p value to 0.05/5, or 0.01. Alas, a smaller p value means a bigger sample size. It's difficult to work out exactly how much bigger, because somehow or other the inflated Type II error should also be taken into account. Certainly, nearly doubling the group size from the usual 400 would be a good start in this example, because as we've already seen on this page, that would be equivalent to a p value of 0.01 and a Type II error of 10%, instead of the usual 0.05 and 20%.

More Than One Independent Variable

Suppose you intend to measure half a dozen things like age, sex, body fat, whatever, and you want to know the effect of each of them on severity of injury in a particular sport. How many subjects do you need? Before we get clever with complex models for this question, let's take in the big view. If we treat each variable as a separate issue, it should be obvious that there will be a problem with inflation of the Type I error: none of the variables you've measured might predict severity of injury in the population, but if you have enough variables, there's a good chance one will predict injury in your sample. So you'll need to reduce your p value using Bonferroni's 0.05/n, where n is the number of independent variables. This correction will be too severe if the independent variables are correlated, but I don't know how to adjust for that.

When you analyze the data, you should look at the effect of the independent variables separately to start with, but you will also end up using multiple linear regression, analysis of covariance, or some other complex model, with all the independent variables on the right-hand side of the model. As I explained on the first page devoted to complex models, you are now asking a question about how much each variable contributes to the severity of injury in the presence of (when you control for) the others. How many subjects do you need to answer this question? Theoretically the extra independent variables shouldn't make much difference, but I've checked by simulation to make sure. You need one extra subject for each extra independent variable. With five extra variables, that makes five extra subjects. Forget it. With a thousand or so subjects, five won't make any difference.

Here's a different problem involving more than one independent variable, where you don't have to worry about increasing the sample size to reduce the Type I error. Suppose you are currently predicting competitive performance from four lab and field tests, and you want to know whether it's worth adding an expensive fifth test to the test battery. For this sort of problem, you would model the data by doing a multiple linear regression, with the expensive test as the last independent variable in the model. So, how many subjects? It's a specific extra variable in this case, so there is no inflation of the Type I error, and the sample size is still about 800. But if all the field tests were in there on an equal footing, and you wanted to know which ones to drop out of the test battery, then it's back to the bigger sample size of the previous example.
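Here is a minimal sketch of how a Bonferroni-reduced p value feeds back into the sample size, using the same standard-normal cross-check as in the earlier sketch; the Type II error is left at 20% here, although, as argued above, it should really be tightened as well.

```python
from statistics import NormalDist

def total_n(es, alpha=0.05, beta=0.20):
    # Same rounded-constant cross-check as before: total N across both groups.
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(1 - beta)) ** 2 / es ** 2

comparisons = 5                          # e.g. five independent contrasts among six regions
alpha_bonferroni = 0.05 / comparisons    # 0.01

print(round(total_n(0.2)))                           # ~785: the usual 5%/20% case
print(round(total_n(0.2, alpha=alpha_bonferroni)))   # ~1168: Bonferroni alpha, Type II still 20%
print(round(total_n(0.2, alpha=0.01, beta=0.10)))    # ~1488: alpha 1% and Type II 10%, nearly double
```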
In that last case you'd use stepwise regression, with a reduced p value for entry of variables into the model.

To make it easier for everyone to find things, I have listed all the statistics topics on the site below, arranged alphabetically, so you can spot the information you need at a glance.

New View of Stats: Home
Sportscience: Home

[The site's full A-to-Z index of statistics topics, from "About These Pages" through "Within-subject variation", was pasted here.]
Seeing that people are still asking about sample size, I'm bumping this to share it again.

Nice website, thanks for sharing.
