GROUP and COGROUP in Pig
============================================================
First, I will load the file from the local file system into the Grunt shell.
grunt> st1 = LOAD '/home/sandeep/student'
USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
I want to group the dataset by age, so I will apply the GROUP operator.
-------------------------------------------------------------
gAge = GROUP st1 BY age;
DUMP gAge;
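Each tuple in gAge has two fields: group (the age value) and a bag named st1 that holds every student record with that age. As a minimal sketch of how the grouped relation might be used, the statements below print its schema and count the students in each age group. The relation name ageCount and the field name total are illustrative and are not part of the original walkthrough.

```pig
-- Show the schema of the grouped relation: the key "group" plus a bag named st1
DESCRIBE gAge;

-- Count how many student records fall under each age
-- (ageCount and total are illustrative names)
ageCount = FOREACH gAge GENERATE group AS age, COUNT(st1) AS total;
DUMP ageCount;
```

COUNT is a built-in Pig function that takes a bag, which is why it is applied to st1 (the bag inside each group) and not to the relation itself.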
COGROUP
------------------------
By convention, GROUP is used with a single relation, whereas COGROUP is used with two or more relations (Pig treats the two keywords as the same operator).
To try it, we load the same dataset as two different relations.
You can also load two different datasets.
For convenience, I take one dataset and store it under two different relation names.
-------------------------------------------------------------------------------
st2 = LOAD '/home/sandeep/student' USING PigStorage(',') AS
(id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
st3 = LOAD '/home/sandeep/student' USING PigStorage(',') AS (id:int, firstname:chararray, lastname:chararray,
age:int, phone:chararray, city:chararray);
st4 = COGROUP st2 BY age, st3 BY age;
DUMP st4;
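Each tuple in st4 has three fields: group (the age value), a bag named st2 and a bag named st3, one bag per input relation. The sketch below, which assumes st2, st3 and st4 are defined as above, prints the schema and counts the records each relation contributes per age. The names perAge, from_st2 and from_st3 are illustrative only.

```pig
-- Show the schema: the key "group" followed by one bag per cogrouped relation
DESCRIBE st4;

-- For each age, count the records that came from st2 and from st3
-- (perAge, from_st2 and from_st3 are illustrative names)
perAge = FOREACH st4 GENERATE group AS age, COUNT(st2) AS from_st2, COUNT(st3) AS from_st3;
DUMP perAge;
```

Because both relations are loaded from the same file here, the two counts should match for every age. With two different datasets, an age that appears in only one of them would show an empty bag for the other relation.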