Wilcoxon Signed Rank Test

Introduction

The Wilcoxon Signed Rank Test is a widely used non-parametric test for comparing two related (paired) samples, or for testing the median of a single sample. It is useful when the data do not follow a normal distribution or when the measurements are available only in ordinal form.

The test is a refinement of the Sign Test. Where the Sign Test considers only the direction (+ or –) of each difference, the Wilcoxon test takes both the magnitude and the direction into account. This makes it more sensitive and more accurate.

Purpose of the Test

To check whether the median difference between two related samples is zero.
For use when the data are not suitable for a parametric test (such as the t-test).
For use when the data are rank-based or non-normal.

Formulating the Hypotheses

Null Hypothesis (H₀): Median difference = 0 (no significant difference).
Alternative Hypothesis (H₁): Median difference ≠ 0 (a significant difference exists).

Procedure of the Wilcoxon Signed Rank Test

Compute the difference for each pair: D_i = X_2i – X_1i
Discard any differences equal to 0.
Take the magnitudes |D_i| of the remaining differences.
Arrange these magnitudes in ascending order and assign ranks; tied magnitudes each receive the average of the ranks they span.
Attach the original sign (+ or –) of each difference to its rank.
Compute the sum of the positive ranks (T⁺) and the sum of the negative ranks (T⁻).
The test statistic is T = the smaller of the two sums.
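The steps above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names `average_ranks` and `signed_rank_sums` and the sample numbers are assumptions made up for this sketch, and tied magnitudes are given the average of the ranks they span.

```python
def average_ranks(values):
    """Rank values ascending (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def signed_rank_sums(first, second):
    """Return (T_plus, T_minus, T) for paired samples, with D = second - first."""
    diffs = [b - a for a, b in zip(first, second) if b != a]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)


# Hypothetical paired measurements, made up for illustration.
before = [10, 12, 9, 14, 11, 13]
after = [13, 11, 9, 19, 13, 17]
print(signed_rank_sums(before, after))  # (14.0, 1.0, 1.0)
```

The third pair has a zero difference and is dropped, so five differences are ranked and T⁺ + T⁻ = 15 = 5 · 6 / 2.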

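Example

Suppose a company measured its employees' performance scores before and after a new training program:

Employee	Score Before (X₁)	Score After (X₂)	Difference (D)	|D|	Rank	Sign
1	60	65	+5	5	4	+
2	58	60	+2	2	1.5	+
3	55	52	–3	3	3	–
4	62	68	+6	6	5	+
5	57	55	–2	2	1.5	–

Employees 2 and 5 tie at |D| = 2, so they share the average of ranks 1 and 2, which is 1.5.

The rank sums are:

T⁺ = 4 + 1.5 + 5 = 10.5
T⁻ = 3 + 1.5 = 4.5

T = smaller sum = 4.5

As a check, T⁺ + T⁻ = 15 = n(n + 1)/2 for n = 5.

For n = 5 and α = 0.05, the Wilcoxon table gives a critical value of 0 for a one-tailed test and no critical value for a two-tailed test, because the smallest attainable two-tailed p-value with five pairs is 2/32 = 0.0625. Since 4.5 > 0, we do not reject H₀. That is, the data do not show a significant improvement in performance after the training.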
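With only five pairs, the table lookup can be cross-checked by computing an exact p-value directly. The sketch below assumes average ranks for the two tied differences of magnitude 2 in the training example, which gives signed ranks +4, +1.5, –3, +5, –1.5. The function name `exact_two_sided_p` is hypothetical. It enumerates every way of assigning + and – signs to the observed ranks, since under H₀ all of them are equally likely.

```python
from itertools import product


def exact_two_sided_p(signed_ranks):
    """Exact two-sided p-value for T = min(T+, T-), by enumerating sign patterns.

    Under H0 each rank is equally likely to carry a + or - sign, so all
    2**n sign assignments of the observed ranks are equally probable.
    """
    ranks = [abs(r) for r in signed_ranks]
    total = sum(ranks)
    t_plus = sum(r for r in signed_ranks if r > 0)
    t_obs = min(t_plus, total - t_plus)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        # Count patterns at least as extreme as the observed statistic.
        if min(plus, total - plus) <= t_obs:
            hits += 1
    return hits / 2 ** len(ranks)


# Signed ranks for employees 1..5 (assumption: ties get average ranks).
p = exact_two_sided_p([4, 1.5, -3, 5, -1.5])
print(p)  # 0.5
```

A p-value of 0.5 is far above α = 0.05, which agrees with not rejecting H₀. Full enumeration is only practical for small n, because the number of sign patterns doubles with every additional pair.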

Mathematical Basis of the Test

If the sample size is small (n ≤ 20), critical values are taken from the table. If the sample is large, the normal approximation can be used:

Z = (T – n(n + 1)/4) / √[n(n + 1)(2n + 1)/24]

Here n is the number of non-zero differences.
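The approximation is straightforward to evaluate in code. The sketch below implements the formula exactly as written, with no continuity correction and no adjustment for ties; the function names and the values n = 25, T = 100 are hypothetical and chosen only for illustration.

```python
import math


def wilcoxon_z(t, n):
    """Z statistic from the normal approximation to the signed-rank statistic."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mean) / sd


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical values for illustration: n = 25 non-zero differences, T = 100.
z = wilcoxon_z(100, 25)
print(round(z, 2), round(two_sided_p(z), 3))
```

Because T is the smaller rank sum, Z is never positive here; H₀ is rejected at α = 0.05 (two-tailed) when |Z| exceeds 1.96.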

Uses in Data Science

Comparing performance before and after a model improvement.
Examining outcomes before and after treatment in clinical trials.
Comparing the median error of different machine learning models.
Examining the effect of a change in customer surveys.

Advantages

Simple and reliable.
Does not require normality.
Suitable for small samples.
Takes both magnitude and direction into account.

Limitations

Computation is time-consuming on large data sets.
Data with zero differences have to be ignored.
Applies only to related samples.

Conclusion

The Wilcoxon Signed Rank Test is extremely useful in data science in situations where traditional parametric tests (such as the t-test) are not appropriate. It takes into account not only the direction but also the size of the differences, which gives a more accurate and robust analysis. It is widely used in machine learning model evaluation, healthcare data analysis, and experimental research.

Wilcoxon Signed Rank Test

Introduction

The Wilcoxon Signed Rank Test is a widely used non-parametric test for analyzing paired or related samples. It serves as an alternative to the paired t-test when the normality assumption is not met or when the data are ordinal. Unlike the Sign Test, which considers only the direction of the differences, the Wilcoxon test also accounts for their magnitudes, which makes it more sensitive.

Objectives

To determine whether the median difference between paired observations is zero.
To test the effect of a treatment or change in related samples.
To handle non-normal or rank-based data effectively.

Hypotheses

Null Hypothesis (H₀): Median difference = 0 (no significant change).
Alternative Hypothesis (H₁): Median difference ≠ 0 (significant change).

Procedure

Calculate the difference between each paired observation (D = X₂ – X₁).
Ignore zero differences.
Take the absolute values |D| and rank them in ascending order, giving tied values the average of the ranks they span.
Assign the original sign (+/–) to each rank.
Sum the positive (T⁺) and negative (T⁻) ranks separately.
The smaller of the two sums (T) is the test statistic.
Compare T with the critical value from the Wilcoxon table.
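For small samples, the critical values found in a Wilcoxon table can be reproduced from the exact null distribution. The sketch below is illustrative and assumes no ties, so the ranks are exactly 1..n; the function names `null_counts` and `critical_value` are hypothetical. It counts how many of the 2ⁿ equally likely sign patterns give each positive-rank sum, then finds the largest t whose lower-tail probability stays within α (one-tailed) or α/2 (two-tailed). H₀ is rejected when the observed T is less than or equal to that value.

```python
def null_counts(n):
    """counts[s] = number of sign patterns whose positive-rank sum is s (ranks 1..n)."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Iterate downward so each rank is used at most once (subset-sum counting).
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def critical_value(n, alpha, two_sided=True):
    """Largest t whose lower-tail probability fits the level, or None if none does."""
    tail = alpha / 2 if two_sided else alpha
    cumulative, best = 0, None
    for t, c in enumerate(null_counts(n)):
        cumulative += c
        if cumulative / 2 ** n <= tail:
            best = t
        else:
            break
    return best


for n in (5, 6, 10):
    print(n, critical_value(n, 0.05), critical_value(n, 0.05, two_sided=False))
# 5 None 0
# 6 0 2
# 10 8 10
```

The `None` for n = 5 reflects that a two-tailed test at α = 0.05 cannot reject H₀ with only five pairs: even the most extreme outcome, T = 0, has a two-tailed probability of 2/32 = 0.0625.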

Example

Suppose a data scientist wants to test whether a new model improves prediction accuracy compared to an old model. Five datasets were tested, and errors before and after were recorded.

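Dataset	Old Model Error	New Model Error	Difference (D)	|D|	Rank	Sign
1	12	10	–2	2	4	–
2	15	13	–2	2	4	–
3	10	9	–1	1	1.5	–
4	18	20	+2	2	4	+
5	14	13	–1	1	1.5	–

The two differences with |D| = 1 share ranks 1 and 2 (average 1.5), and the three with |D| = 2 share ranks 3, 4, and 5 (average 4).

T⁺ = 4, T⁻ = 4 + 4 + 1.5 + 1.5 = 11 → T = 4

Compare T with the Wilcoxon critical value table. For n = 5 and α = 0.05, the one-tailed critical value is 0, and no two-tailed critical value exists because five pairs cannot give a two-tailed p-value below 0.0625. Since 4 > 0, fail to reject H₀ → no significant improvement in accuracy.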

Mathematical Expression

For larger samples (n > 20), approximate using the normal distribution:

Z = (T – n(n + 1)/4) / √[n(n + 1)(2n + 1)/24]
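As a worked illustration of the formula, take hypothetical values n = 30 non-zero differences and T = 120 (both made up for this example, with no correction for ties or continuity):

```latex
\begin{aligned}
\mu_T &= \frac{n(n+1)}{4} = \frac{30 \cdot 31}{4} = 232.5 \\
\sigma_T &= \sqrt{\frac{n(n+1)(2n+1)}{24}} = \sqrt{\frac{30 \cdot 31 \cdot 61}{24}} = \sqrt{2363.75} \approx 48.62 \\
Z &= \frac{T - \mu_T}{\sigma_T} = \frac{120 - 232.5}{48.62} \approx -2.31
\end{aligned}
```

Since |Z| ≈ 2.31 exceeds 1.96, H₀ would be rejected at α = 0.05 in a two-tailed test for these assumed values.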

Applications in Data Science

Comparing performance of models before and after optimization.
Testing the effect of algorithm tuning on prediction errors.
Used in A/B testing when results are ordinal or skewed.
Applied in clinical and psychological studies for paired comparisons.

Advantages

Considers both direction and magnitude of differences.
Does not assume normality or equal variances.
Effective even for small samples.
Robust against outliers and non-normal data.

Limitations

Computation becomes complex for large samples.
Zero differences must be excluded.
Only suitable for paired or matched samples.

Conclusion

The Wilcoxon Signed Rank Test is a cornerstone of non-parametric inference, bridging the gap between simple tests like the Sign Test and complex parametric approaches. It provides a reliable, distribution-free way to evaluate paired differences, making it indispensable in modern data science, especially for model evaluation, algorithm comparison, and experimental validation where data normality cannot be assumed.
