Attributes, Variables, and Types of Data

While studying any phenomenon we come across two types of characteristics: (i) constant and (ii) variable.

A characteristic that does not change its value (or nature) is called a constant.

For example: the height of a person after 25 years of age, the altitude of a certain place above sea level, etc.

On the other hand, there are many characteristics which are qualitative or quantitative in nature and change their values (or nature).

For example: the examination result of a candidate can be recorded as pass or fail, which is a qualitative variable characteristic, whereas the same candidate's performance expressed as a percentage of marks is a quantitative variable.

Types of Characteristics:

Attributes and Variables

Statistics involves the study of variable characteristics. Hence, we include the related and necessary definitions.

Attribute: A qualitative characteristic such as sex, nationality, religion, grade in examination, blood group, beauty, or defectiveness of an article produced by a machine is called an attribute.

There are four scales of measurement, viz. the nominal, ordinal, interval and ratio scales.

Attributes are measured using nominal and ordinal scales.

Nominal Scale:

Nominal scale consists of two or more named categories into which the objects are classified.

For example:

  • Classification of individuals using blood groups constitutes a nominal scale.
  • Classification of students in various divisions of the same standard also represents a nominal scale.
  • Classification of individuals using sex, caste, nationality, etc. also uses a nominal scale.
  • House numbers, survey numbers, pincode numbers are also examples of nominal scale.

Remarks:

(i) In nominal scale if numbers are used, then those are allotted in a purely arbitrary manner. Those numbers are just for identification purposes used in place of labels.

(ii) Those numbers are interchangeable.
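The two remarks above can be checked with a small sketch. The blood-group observations and both numeric codings below are hypothetical, chosen only for illustration: swapping the code numbers leaves the classification, and hence the frequency of each category, unchanged.

```python
from collections import Counter

# Hypothetical observations of blood group (illustrative values only).
blood_groups = ["A", "B", "O", "A", "AB", "O", "O"]

# Two arbitrary numeric codings of the same categories.
coding_1 = {"A": 1, "B": 2, "AB": 3, "O": 4}
coding_2 = {"A": 4, "B": 3, "AB": 2, "O": 1}

def counts_by_group(observations, coding):
    """Code the observations as numbers, then count how often each category occurs."""
    decode = {code: label for label, code in coding.items()}
    coded = [coding[group] for group in observations]
    return Counter(decode[code] for code in coded)

counts_1 = counts_by_group(blood_groups, coding_1)
counts_2 = counts_by_group(blood_groups, coding_2)
print(counts_1 == counts_2)
```

Because the numbers act only as labels, arithmetic on them (for instance, averaging the codes) would give a different answer under each coding and so carries no meaning.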

Ordinal Scale:

The ordinal scale of measurement assigns numbers to groups of objects according to some quantifiable characteristic. Therefore, an ordered arrangement of the groups is possible on this type of scale.

For example:

  • Groups of individuals according to income such as poor, middle class, rich.
  • Groups of students according to grades in examination, such as fail, second class, first class, first class with distinction.
  • Groups using weight such as light, heavy.
  • Groups using height such as short, medium, tall.

Similarly, groups of individuals as dull or intelligent, groups of objects as soft or hard, etc. are all situations where the ordinal scale can be used.

Remarks:

(i) In the ordinal scale, the numbers given to groups as labels serve the purpose of ranks. Hence those labels are not interchangeable.

(ii) In the ordinal scale, the groups are ordered according to some characteristic. Suppose three individuals A, B, C are given ranks 1, 2, 3 respectively according to their height, so that A is the shortest and C is the tallest. In this case the heights of A, B, C may not be equispaced, although their ranks are.

(iii) The rank of individual B is 2, but the height of B is not double the height of A, nor is the height of C three times the height of A. The ranks 2 and 3 are integer multiples of the rank of A, but the heights of B and C need not be integer multiples of the height of A. The ranks only tell us that B is taller than A and C is taller than B.
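A minimal sketch of remarks (ii) and (iii), using hypothetical heights for A, B and C (the values are assumptions made only for illustration): the ranks are equispaced even though the underlying heights are not.

```python
# Hypothetical heights in cm for individuals A, B, C (illustrative values only).
heights = {"A": 150.0, "B": 152.0, "C": 180.0}

# Rank the individuals from shortest (rank 1) to tallest.
ordered = sorted(heights, key=heights.get)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

# Gaps between consecutive ranks are always 1 ...
rank_gaps = [ranks[ordered[i + 1]] - ranks[ordered[i]] for i in range(len(ordered) - 1)]
# ... but the gaps between the underlying heights are unequal.
height_gaps = [heights[ordered[i + 1]] - heights[ordered[i]] for i in range(len(ordered) - 1)]

print(ranks)        # {'A': 1, 'B': 2, 'C': 3}
print(rank_gaps)    # [1, 1]
print(height_gaps)  # [2.0, 28.0]
```

The ranks preserve the order of the heights and nothing more, which is why differences and ratios of ranks should not be interpreted.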

Nominal and ordinal scales are used in the measurement of attributes.

Variable:

A quantitative characteristic (which changes its value) like weight of person, examination marks, population of a country, profit of a salesman, is called a variable.

Note: Variables are measured using the interval scale and the ratio scale. The drawback of the ordinal scale, namely that its units are not equispaced, is overcome in the interval scale.

Interval Scale:

The interval scale of measurement has equal units of measurement; however, the zero point is arbitrary.

The classic example of an interval scale in day-to-day life is the Celsius (Centigrade) or Fahrenheit scale of temperature measurement. In both scales zero is arbitrary; it does not mean absence of heat. Moreover, 60°C does not represent exactly double the heat of 30°C. However, the difference in temperature between 10°C and 20°C is the same as that between 50°C and 60°C (or between any similar pair).
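The temperature example can be verified numerically. The sketch below uses the standard Celsius-to-Fahrenheit conversion; the helper name `celsius_to_fahrenheit` is introduced here only for illustration. Equal differences stay equal after the change of scale, whereas the ratio of two readings does not survive it.

```python
def celsius_to_fahrenheit(celsius):
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9.0 / 5.0 + 32.0

# Equal differences in Celsius remain equal in Fahrenheit.
diff_low_f = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_high_f = celsius_to_fahrenheit(60) - celsius_to_fahrenheit(50)
print(diff_low_f, diff_high_f)  # 18.0 18.0

# The ratio of two readings depends on which scale is used.
ratio_c = 60 / 30
ratio_f = celsius_to_fahrenheit(60) / celsius_to_fahrenheit(30)
print(ratio_c)             # 2.0
print(round(ratio_f, 3))   # 1.628
```

Since the "twice as much" reading disappears when only the unit and zero point change, the ratio 60°C : 30°C says nothing about the amount of heat.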

Drawback:

In the interval scale 0 is arbitrary and is chosen as per convenience; therefore we can add (or subtract) a constant to the readings without affecting the form of the scale. However, ratios of readings are not meaningful, so one reading cannot be described as a multiple of another.

Use:

Despite this drawback, the interval scale is used for convenience in the behavioral sciences to study mental and social variables and traits.

All the drawbacks of the earlier three scales of measurement, viz. the nominal, ordinal and interval scales, are overcome in the ratio scale. It is the most informative scale of measurement and is used for most physical measurements.

Ratio Scale:

The ratio scale of measurement has equal units of measurement, and these are counted from a true zero.

Measurements such as height (cm), weight (kg), time (hours), etc. are examples of ratio scales. On this scale a weight of 60 kg is exactly twice as heavy as a weight of 30 kg.
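As a brief illustration, assuming a change of unit from kilograms to grams, the ratio of two weights is unaffected because both units are counted from the same true zero.

```python
GRAMS_PER_KG = 1000

weights_kg = (60, 30)
weights_g = tuple(w * GRAMS_PER_KG for w in weights_kg)

ratio_kg = weights_kg[0] / weights_kg[1]
ratio_g = weights_g[0] / weights_g[1]

print(ratio_kg, ratio_g)  # 2.0 2.0
```

This is the property that temperature in °C or °F lacks, and it is what makes statements such as "twice as heavy" meaningful.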

It can be clearly noticed that variables can be measured by numbers. Further, the variables can be divided into two categories: (i) discrete and (ii) continuous.

Definition:

A variable taking only particular values is called a discrete variable.

For example: the number of students in a class, the number of articles produced by a machine, the population of a country, the number of workers in a factory, etc. are discrete variables. Most discrete variables take integer values.

Definition:

A variable taking all possible values in a certain range is called a continuous variable.

For example: the weight of a person, the length of a screw produced by a machine, the temperature at a certain place, agricultural production, the electricity consumption of a family, and the speed of a vehicle are examples of continuous variables.

It is observed that many continuous variables, such as marks, income, weight of a person, etc., look like discrete variables after measurement. This is mainly due to the limitations of the measuring instruments. With better instruments one can measure more precisely and reduce this difficulty.
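The effect of instrument resolution can be sketched as follows. The "true" weights and the two resolutions (nearest kilogram and nearest 0.1 kg) are hypothetical values assumed for illustration.

```python
# Hypothetical true weights in kg (illustrative values only).
true_weights = [61.237, 61.482, 61.913]

# A coarse instrument reads to the nearest kilogram,
# a finer one to the nearest 0.1 kg.
coarse_readings = [round(w, 0) for w in true_weights]
fine_readings = [round(w, 1) for w in true_weights]

print(coarse_readings)  # [61.0, 61.0, 62.0]
print(fine_readings)    # [61.2, 61.5, 61.9]
```

On the coarse instrument two different weights collapse to the same reading, so the recorded data look discrete even though the underlying variable is continuous.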

The following diagram summarizes the various types of data:

                  Data
                /      \
        Attribute      Variable
      (qualitative)  (quantitative)
                       /      \
                Discrete    Continuous

Directional Data and Circular Scale:

Some variables are cyclic or rhythmic in nature according to time.

For example: blood pressure, reproductive cycles, body temperature, mental alertness, sleep-wake cycles, hormonal pulsatility. Such variables are characteristics controlled by biological rhythms, and the corresponding data are treated as directional data or circular data. Moreover, the direction of wind, the direction of the earth's magnetic pole, the direction of birds' movement, and the direction of river flow are examples of directional data. Variables measured in angles, i.e. on a circular scale rather than a linear scale, are called directional data or circular data.

Statistical tools such as the mean and variance, used in the usual manner, do not remain meaningful or suitable for such data. The statistical methods to be used in these instances are entirely different.
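A small sketch shows why the usual mean fails for angles. Take two hypothetical wind directions, 350° and 10°, both close to north. The circular mean used below (the direction of the average unit vector) is one common summary from directional statistics; it is offered here as an illustration and is not a method named in the text.

```python
import math

def circular_mean_degrees(angles):
    """Direction of the average unit vector, in [0, 360).

    Not defined when the unit vectors cancel exactly
    (for example, angles 0 and 180).
    """
    sin_sum = sum(math.sin(math.radians(a)) for a in angles)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0

# Hypothetical wind directions, both close to north (0 degrees).
directions = [350.0, 10.0]

arithmetic_mean = sum(directions) / len(directions)
circular_mean = circular_mean_degrees(directions)

# Angular distance of the circular mean from north, allowing for wrap-around.
distance_from_north = min(circular_mean, 360.0 - circular_mean)

print(arithmetic_mean)      # 180.0, i.e. due south
print(distance_from_north)  # effectively zero, i.e. due north
```

The arithmetic mean points in exactly the opposite direction to both observations, while the circular mean points north up to floating-point rounding.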

Directional data are also observed in the circular movement of automobile parts, in oceanography, in the travel of ships, etc.

The details are beyond the scope of this book. The only purpose here is to introduce such non-trivial data and the situations in which they arise.

Collection and Organization of Data

Collection of data is very important work and needs to be done carefully. One has to decide the objectives clearly before collecting the data. In order to obtain dependable and reliable results, proper data should be collected in a proper way. According to the method of collection, data are of two types, viz. (a) primary data and (b) secondary data.

Apart from the method of collection, data are also classified according to their nature (viz. time series data and cross-sectional data).

(a) Primary Data:

Primary data means original data (i.e. facts and figures) obtained by the investigator himself. Primary data may be the result of a survey or enquiry, and may be regarded as first-hand information. Population census results are a classic example of primary data. Primary data are also called raw data. No doubt, primary data are more reliable than any other type, but they are expensive and time-consuming to collect.

Primary data are collected by the following methods:

  • Direct personal investigation or interview.

In this method, the investigator meets the concerned persons, known as 'informants', and collects the necessary information by interviewing them. The investigator should be thorough in handling the problems of investigation, as this results in reliable data. The investigator has to go to the source of the original information. For example, an investigator who wants to know the amount of production in a particular industry should collect the figures by visiting the machine floor, rather than from an office or a bulletin. This is the best method of collecting primary data. However, the investigator has to take certain precautions.

  • Indirect oral investigation.
  • Investigation through questionnaire.

(b) Secondary Data:

Data taken from sources such as office records, bulletins, reports, etc., which have already been collected by some other agency, are called 'secondary data'.

The data which are already collected may be tabulated, classified, ordered, etc. Hence, it is called processed or finished data. Thus, secondary data can also be called finished product.

'Secondary data' is a relative term.

For example, if A collects original data, then the data are primary data for A; whereas if the same data are used by B, then they become secondary data for B. In this case, the only difference is that the user of secondary data may not understand the background as thoroughly as the collector of the primary data does.

Difference between Primary and Secondary Data:

  • (a) The main difference lies in the method of collection.
  • (b) Primary data are original in nature and hence are generally more accurate than secondary data.
  • (c) Collection of primary data is expensive as well as time-consuming.
  • (d) Primary data can be elicited in accordance with the objectives of a study. Secondary data may fail in this regard.

The methods of data collection are (i) surveys, (ii) laboratory experiments, (iii) simulation.

Surveys:

With the help of sample surveys or complete enumeration, primary or secondary data can be collected.

Laboratory Experiments:

Observations generated in laboratory experiments are another source of data.

Simulation:

Some experiments cannot be conducted in a laboratory, for example genetic experiments, or experiments with hazardous or radioactive material. In such cases, data are nowadays generated using simulation techniques with the help of computers. Simulation has tremendous scope in industry, business, etc.

For example, the number of counters or salesmen required in a departmental store can be studied by simulating the queues that form, using queueing theory.
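The departmental store example can be sketched as a small computer simulation. Everything specific below is an assumption made for illustration and does not come from the text: customers arrive at random with a mean gap of 1 minute, each service takes 2.5 minutes on average (both drawn from exponential distributions), customers are served in order of arrival by whichever counter becomes free first, and the function name `average_wait` is hypothetical.

```python
import heapq
import random

def average_wait(num_counters, num_customers=2000,
                 mean_interarrival=1.0, mean_service=2.5, seed=0):
    """Simulate a first-come, first-served queue and return the mean waiting time."""
    rng = random.Random(seed)
    # Min-heap holding the time at which each counter next becomes free.
    free_at = [0.0] * num_counters
    heapq.heapify(free_at)
    clock = 0.0
    total_wait = 0.0
    for _ in range(num_customers):
        # Next customer arrives after a random gap.
        clock += rng.expovariate(1.0 / mean_interarrival)
        # The customer goes to the counter that becomes free earliest.
        earliest_free = heapq.heappop(free_at)
        start = max(clock, earliest_free)
        total_wait += start - clock
        # That counter stays busy for a random service time.
        heapq.heappush(free_at, start + rng.expovariate(1.0 / mean_service))
    return total_wait / num_customers

# Compare the simulated average wait for different numbers of counters.
for counters in (2, 3, 4):
    print(counters, round(average_wait(counters), 2))
```

Using the same seed for every run means each configuration faces the same stream of customers, so differences in waiting time reflect only the number of counters. A store manager could then pick the smallest number of counters that keeps the simulated wait acceptable.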

Other Types of Data:

There is yet another angle from which to look at data. So far we have considered the method of collection. However, data can also be classified by their nature and by other characteristics. If we take into account when the data were collected, we introduce the time characteristic, which gives rise to data specially termed time series data. Sometimes we collect data at one fixed moment, so that time is considered but held constant. Such data are referred to as cross-sectional data. The specific definitions are as follows:

Time Series Data:

Data arranged in chronological order (as per the order of occurrence) are called time series data.

For example:

  • Daily sales of a departmental store.
  • Daily electricity consumption of a town.
  • Price of gold recorded daily.

Cross-sectional data:

The values of variables observed at a particular time, at several places or on several objects, are called cross-sectional data.
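The two definitions can be contrasted on one small table of records. The daily sales figures, store names and dates below are hypothetical, and the helper names `time_series` and `cross_section` are introduced only for this sketch.

```python
# Hypothetical daily sales (units sold), keyed by (date, store); values are made up.
sales = {
    ("2024-01-01", "Store A"): 120,
    ("2024-01-01", "Store B"): 95,
    ("2024-01-02", "Store A"): 130,
    ("2024-01-02", "Store B"): 90,
    ("2024-01-03", "Store A"): 125,
    ("2024-01-03", "Store B"): 100,
}

def time_series(data, store):
    """One store observed over successive dates, in chronological order."""
    return [(date, value) for (date, s), value in sorted(data.items()) if s == store]

def cross_section(data, date):
    """All stores observed on a single fixed date."""
    return {s: value for (d, s), value in data.items() if d == date}

print(time_series(sales, "Store A"))
# [('2024-01-01', 120), ('2024-01-02', 130), ('2024-01-03', 125)]
print(cross_section(sales, "2024-01-02"))
# {'Store A': 130, 'Store B': 90}
```

In the time series the object is fixed and time varies; in the cross-section time is fixed and the object varies.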
