Thursday, June 21, 2012

Unix Day - 2 - DB Connection and File Reading

Reading a file and passing each value as a parameter to Oracle SQL to fetch data.

#!/bin/sh
# Remove the previous output file if present (-f: plain file, no error if missing)
rm -f FinalDepInfo.txt
for row in `cat myDep.txt`
do
# Run the query for each department id; spool writes the result to mytest.txt
C:/oraclexe/app/oracle/product/10.2.0/server/BIN/sqlplus -s hr/magogate@XE <<QUERY
spool mytest.txt
select first_name, last_name, email
from employees
where department_id = $row;
spool off;
exit;
QUERY
# Append this department's rows to the final output
cat mytest.txt >> FinalDepInfo.txt
done


Reading a file line by line

#!/bin/sh
# Read FinalDepInfo.txt one line at a time and echo each line
while read line
do
    echo "$line"
done < FinalDepInfo.txt
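
A slightly safer variant of the first script, as a sketch (assuming the same hr/magogate@XE connection and one department_id per line in myDep.txt), combines the two ideas above: it uses a while/read loop instead of for-over-cat, so spacing in the file does not break the iteration, and sends the sqlplus output straight into the final file:

#!/bin/sh
# Sketch only: connection string and file layout are the same assumptions as above
rm -f FinalDepInfo.txt
while read row
do
    sqlplus -s hr/magogate@XE <<QUERY >> FinalDepInfo.txt
set heading off feedback off
select first_name, last_name, email
from employees
where department_id = $row;
exit;
QUERY
done < myDep.txt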

Sunday, June 10, 2012

Dynamic Lookup..

The "Dynamic Look-up" implementation in any project I have not seen personally. People say that it get used in SCD-2 type implementation, but even that is not mandatory; In my last project, to implement SCD-2 type, they had used static lookup itself.

But, I came across one scenario wherein it was a need to impement Dynamic Lookup. In my project they have used static lookup, when it was giving problem in particular situation.

We had layers of a data flow as below:
1. Source system, which was OP.
2. Staging tables pulling data from OP.
3. Dimension and fact tables, which were pulling data from staging.

In staging, most of the tables were truncate-and-reload, and the indexes defined on them used to be dropped and re-created at the initiation and completion of every staging data load using a PL/SQL script.

Data for the staging load was picked up using the below condition at the source qualifier.

Select *
from ods_dw_op_per_all_people_f_stg
where ges_update_date >=
(select max(last_successful_completion_date) - 3
from dw_jobs
where map_name = <Hard Coded Value>)

So, here ods_dw_op_per_all_people_f_stg is our source system table; it is actually a synonym at the target database. Using the ges_update_date condition, we can pick the latest data from the source system. This technique will work fine most of the time, i.e. 99%. But consider a situation like the below.

We have the below data, which came in on 11-Jun-2012:
Id - 1
Name - Mandar
SurName - Gogate
City - Mumbai
Location - Girgaon

While processing this data from the source system into the staging tables, some prior dependent jobs failed, and somehow this data did not get inserted into the staging tables. Because of this, at the next data load, data from 11-Jun-2012 will get picked up again.

At the source side, on 12-Jun the source data got changed.

The old record was:
Id - 1
Name - Mandar
SurName - Gogate
City - Mumbai
Location - Girgaon

The newly created record at the source is:
Id - 1
Name - Mandar
SurName - Gogate
City - Pune
Location - Bavdhan

So now 2 records exist at the source, one with ges_update_date as 11-Jun and the other as 12-Jun. Our source qualifier for the staging load will pick both records together and will try to insert them at the staging level. Since there is a unique index defined on employee_id, this insertion will ultimately fail.

In the normal scenario, if the job had not failed, the first record would have got inserted into the staging table on 11-Jun, and on 12-Jun, even if 2 records got picked up by the source qualifier, they would have got updated [as there is an update strategy defined at the mapping level].

The solution for this situation is the "Dynamic Lookup", which will pass only one value ahead; in its buffer itself, it will mark the rows as insert or update.
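
To see concretely why the static path breaks, below is a minimal sqlplus sketch of the unique-index failure (the table, index and connection here are illustrative, not the project's real objects):

sqlplus -s hr/magogate@XE <<'SQL'
create table stg_people (employee_id number, city varchar2(30));
create unique index ux_stg_people on stg_people (employee_id);
-- both versions of employee 1 arrive in the same load window
insert into stg_people values (1, 'Mumbai');
-- the second version fails with ORA-00001: unique constraint violated
insert into stg_people values (1, 'Pune');
drop table stg_people;
exit;
SQL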

Wednesday, June 6, 2012

Migration Process - PADA or PVCS

Approximately 2 years ago, I was at CISCO, following the PVCS process for Informatica workflow migrations from Dev to QA and after that from QA to PROD. Now here at CS, the name is different, i.e. PADA, but the process is the same.

Just give a label to the Dev objects, mention the workflows and folder, and the application will migrate those Informatica objects to QA and subsequently to PROD.

I never asked myself how this migration actually happens, or what the code behind it would be; I always assumed people were genius enough to be migrating these things automatically.

Yesterday, while checking with Nilesh, I realized that this process is quite easy. Below is the entire description of it.

Problem -- Nilesh wanted to delete [not purge] some objects from the PROD repository, as they were obsolete. But as the number of objects was large, the Infra admin would not be able to do it manually. So, what would be the process to delete them?
We found one way, as below.

At the Dev repository:

1. In Informatica Repository Manager, create a query which will display all the objects you want to delete, or just apply a label to the required workflows [selecting all children / parents while applying the label] and create a query from Repository Manager giving that label as the search criteria.

2. Right-click on the result set, click on "Change Object Status" and change the object status to Deleted.

3. After changing the status, make sure that you check in all those objects; just click on the folder, do Find Checkouts, select all results and simply check them in.

After the 3rd step, you have completed the deletion of the Informatica objects from the Dev repository. The next steps migrate those deleted objects to QA/PROD.

4. Once you have deleted all the objects, create a query to find the deleted objects, giving "version status = deleted" as the criteria. This query will display all deleted objects. Right-click on the result set and click Apply Label. Apply a new label [make sure you click on apply to children and parents].

5. Create a new query for the deployment group, giving "version status = deleted" and "label contains XXX" as the criteria. Save this query with a specific name.


6. Create a dynamic deployment group and give the above query as an input to it.

7. Just drag and drop this deployment group to the new repository. Just click on the Next button [no need to apply any label in this process] and your deployment will get done.

By doing this, you will successfully delete the objects at QA / PROD.

But the problem Nilesh was facing was different. Till step 6 Nilesh had done everything correctly, but step 7 cannot be done manually; it has to be done by raising a PADA request, and the migration failed at the PADA request.

When we checked the log of the PADA execution, we came across the below main steps:
1. PADA was creating a label for the source repository folder using the createlabel pmrep command.
2. PADA was creating a label for the target repository folder using the createlabel pmrep command.
3. PADA was calling the deployment group which we created in step 6, using the deploydeploymentgroup pmrep command.

And at step 3, it was failing, giving the error:

       Error: Apply label failed for Source Definition object EMPLOYEES version 6 in repository EDWPROD because this version of the object has been purged. This object was skipped.

Now, when we tried deploying the deployment group manually [drag and drop] it worked fine, but from PADA it gave us an error. So, we thought of doing this process the same way PADA does it, i.e. via command-line execution.

For executing a deployment group from the command line, below is the command; you will find it in the help file as well.

First go to the C:\Informatica\PowerCenter8.6.0\server\bin folder from the command line and execute the command pmrep.

Now execute the deploydeploymentgroup command, giving the below parameters as input:

pmrep>deploydeploymentgroup -p Migration -c MigrationTest.xml -r EDWPROD -h infra-34af3549f -o 6001 -n Administrator -x magogate

-p -- Deployment group
-c -- Control file
-r -- Repository
-h -- Host
-o -- Port
-n -- User
-x -- Password

Now, out of all these, only the control file was missing at our end. We searched the help and, following the standard format, created a new one as below.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DEPLOYPARAMS SYSTEM "depcntl.dtd">
<DEPLOYPARAMS DEFAULTSERVERNAME="Administrator"
COPYPROGRAMINFO="YES"
COPYMAPVARPERVALS="YES"
COPYWFLOWVARPERVALS="YES"
COPYWFLOWSESSLOGS="NO"
COPYDEPENDENCY="YES"
LATESTVERSIONONLY="YES"
RETAINGENERATEDVAL="YES"
RETAINSERVERNETVALS="YES">
<DEPLOYGROUP CLEARSRCDEPLOYGROUP="NO">
<OVERRIDEFOLDER SOURCEFOLDERNAME="ASA"
SOURCEFOLDERTYPE="LOCAL"
TARGETFOLDERNAME="ASA_PROD"
TARGETFOLDERTYPE="LOCAL"/>
<APPLYLABEL SOURCELABELNAME="NEW_SRC_LABEL_NAME"
SOURCEMOVELABEL="YES"
TARGETLABELNAME="NEW_TGT_LABEL_NAME"
TARGETMOVELABEL="YES" />
</DEPLOYGROUP>
</DEPLOYPARAMS>

As we have mentioned source and target labels in the control file, we created those at the source and target repositories accordingly. If you do not create those labels, the command will fail, giving an error that the mentioned labels do not exist.

We stored this control file in the same directory as pmrep and ran the above-mentioned command, and this time we got the same error which Nilesh got at the office:

Applying label "NEW_SRC_LABEL_NAME" in the source repository...
Applying label "NEW_TGT_LABEL_NAME" in the target repository...
       Error: Apply label failed for Source Definition object EMPLOYEES version
6 in repository EDWPROD because this version of the object has been purged. This
 object was skipped.
So, what went wrong? We thought its definetly because of labels, and to verify that one, we removed the lable tag from control file.. New control file is like below

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DEPLOYPARAMS SYSTEM "depcntl.dtd">
<DEPLOYPARAMS DEFAULTSERVERNAME="Administrator"
COPYPROGRAMINFO="YES"
COPYMAPVARPERVALS="YES"
COPYWFLOWVARPERVALS="YES"
COPYWFLOWSESSLOGS="NO"
COPYDEPENDENCY="YES"
LATESTVERSIONONLY="YES"
RETAINGENERATEDVAL="YES"
RETAINSERVERNETVALS="YES">
<DEPLOYGROUP CLEARSRCDEPLOYGROUP="NO">
<OVERRIDEFOLDER SOURCEFOLDERNAME="ASA"
SOURCEFOLDERTYPE="LOCAL"
TARGETFOLDERNAME="ASA_PROD"
TARGETFOLDERTYPE="LOCAL"/>
</DEPLOYGROUP>
</DEPLOYPARAMS>

Again we ran the command with the same input, and this time it did the deployment successfully, and the objects at the EDWPROD repository got deleted.

This proves that if you are migrating deleted objects from the command line, you should not give a label in the control file.

The deployment log is as follows:
---------------------------------------------------------------------------------
pmrep>deploydeploymentgroup -p Migration -c MigrationTest.xml -r EDWPROD -h infra-34af3549f -o 6001 -n Administrator -x magogate

--------------------------------------------------------------------------------
Starting copy at 01/07/2009 23:44:10
--------------------------------------------------------------------------------

Creating temporary tables...
Obtaining locks...
Deploy trying to acquire shared lock on Repository EDWPROD in target repository.
Deploy trying to acquire shared lock on Deployment Group Migration in source repository.
Deploy trying to acquire shared lock on Folder ASA in source repository.
Deploy trying to acquire subtree shared lock on Folder ASA in source repository.

Finding Dependencies...
Finding dependent database connections...
Finding dependent FTP connections...
Finding dependent External loader connections...
The Repository Service is finding dependent message queue connections.
Finding dependent External loader connections...
Finding dependent servers...
Finding dependent local shared folders...
Checking for any global shortcuts in this folder...
No global shortcuts detected.
Objects in source folders [ASA] will be copied to the following target folders [ASA_PROD] respectively.
No mapping variable persisted values found in the repository for this folder.
No workflow variable persisted values found in the repository for this folder.
Existing CurrentValue's for Sequence Generator and Normalizer transformations or XML generated keys will be retained.
Existing Integration Service-grid-related values in workflows and session instances will be retained.
Copying Deployment Group Migration
Copying table OPB_MAP_PERSISVAL...
Deploy trying to acquire exclusive lock on Repository EDWPROD in target repository.
Copying temporary tables...
Saving group deployment history...
Clearing source deployment group...
Copying deleted objects:
Source Definition object "EMPLOYEES@XE" from folder "ASA"
Source Definition object "DEPARTMENTS@XE" from folder "ASA"
Target Definition object "EMPLOYEESDUMMY" from folder "ASA"
Target Definition object "DEPARTMENTS" from folder "ASA"
Mapping object "m_DeploymentGroupTest" from folder "ASA"
Workflow object "wf_DeploymentGroupTest" from folder "ASA"
The following deleted objects will not be deployed because they do not exist in the target repository:
Workflow object "wf_DeploymentTest" from folder "ASA"
Committing changes, deployed objects are now checked out...
Checking in checked out objects...
Making nonreusable objects reusable if needed...
Copy was successful, committing changes...

--------------------------------------------------------------------------------
Completed at 01/07/2009 23:44:14
--------------------------------------------------------------------------------

deploydeploymentgroup completed successfully.
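
For repeated migrations, the same two pmrep steps can be scripted; below is a minimal sketch using the example repository, domain, deployment group and credentials from this post (adjust them for your environment):

#!/bin/sh
# Sketch only: all names/credentials below are the examples used in this post
cd "C:/Informatica/PowerCenter8.6.0/server/bin" || exit 1
# Connect to the target repository first; later pmrep commands reuse this connection
./pmrep connect -r EDWPROD -d Domain_infra-34af3549f -n Administrator -s native -x magogate
# Deploy the dynamic deployment group with the label-free control file
./pmrep deploydeploymentgroup -p Migration -c MigrationTest.xml -r EDWPROD -h infra-34af3549f -o 6001 -n Administrator -x magogate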

Thursday, May 24, 2012

Executing informatica query from command prompt

In every difficulty there is an opportunity; yes, it's very true. Today, after my manager asked me to automate something, there was a big question mark over my head: how will I do it? Will it really work? After coming home, I just checked the Informatica help, and they have given it very straightforwardly.

How will you execute an Informatica query from the command prompt?

My task was to find all the checked-out objects in a repository; I do that by creating a query in Repository Manager. But the main pain area was automating this query. The only way to automate it is to run a shell or batch script, but how would I call the query from there? Below is the way to do it.

1. First create a query from Repository Manager; that is a simple task.

2. Execute the query from the command prompt.

You can do this using the pmrep command of Informatica. The pmrep.exe file is located at C:\Informatica\PowerCenter8.6.0\server\bin [if you have installed Informatica on a Windows machine]. From the command prompt, just navigate to this directory.

Before actually executing your query, you have to first connect to the repository. If you directly try to execute the query, you will get an error message saying you first have to connect to the repository.

How will you connect to the repository? That will be via the connect command, as below:

C:\Informatica\PowerCenter8.6.0\server\bin>pmrep connect -r EDWDEV -d Domain_infra-34af3549f [-h infra-34af3549f -o 6001] -n Administrator -s native -x magogate

Here host and port are optional; if you specify the domain, do not specify host and port. Using this command, you will be able to connect to the repository. You only need to specify the repository, domain, user, security domain and password.

Once you are connected to the repository, the last task is to run the query:

C:\Informatica\PowerCenter8.6.0\server\bin>pmrep executequery -q CheckedOutObjects -u CheckedOutObjectsList.txt

Here, we have specified -q <Query Name> and -u <OutputFileName>

If you do not specify an output file name, the results will get displayed at the command prompt itself.
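
Since the whole point was automation, both steps can go into one small script; a minimal sketch, reusing the repository, domain, query name and credentials shown in this post:

#!/bin/sh
# Sketch only: repository/domain/user/query names are the examples from this post
cd "C:/Informatica/PowerCenter8.6.0/server/bin" || exit 1
# Connect first; executequery fails if no connection has been established
./pmrep connect -r EDWDEV -d Domain_infra-34af3549f -n Administrator -s native -x magogate
# Run the saved Repository Manager query and write the results to a file
./pmrep executequery -q CheckedOutObjects -u CheckedOutObjectsList.txt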

Please find below the log of what I tried on my system.

You can also view the video for it at
https://www.youtube.com/watch?v=p-hzjTuqPMI
=============================================================

C:\Documents and Settings\Administrator>cd C:\Informatica\PowerCenter8.6.0\server\bin

C:\Informatica\PowerCenter8.6.0\server\bin>pmrep executequery -q CheckedInObject -u CheckedInObject.txt

Informatica(r) PMREP, version [8.6.0], build [178.0613], Windows 32-bit
Copyright (c) Informatica Corporation 1994 - 2008
All Rights Reserved.
This Software may be protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775; 6,850,947; 6,895,471; 7,254,590 and other U.S. Patents Pending.

Invoked at Thu Jan 08 00:21:32 2009
Failed to read repository connection information. Please first connect to the repository using the connect command.
Failed to execute executequery.

Completed at Thu Jan 08 00:21:32 2009

===================
C:\Informatica\PowerCenter8.6.0\server\bin>pmrep connect -r EDWDEV
Informatica(r) PMREP, version [8.6.0], build [178.0613], Windows 32-bit
Copyright (c) Informatica Corporation 1994 - 2008
All Rights Reserved.
This Software may be protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775; 6,850,947; 6,895,471; 7,254,590 and other U.S. Patents Pending.

Invoked at Thu Jan 08 00:24:09 2009
 [01/08/2009 00:24:09-[REP_12164] Domain-related error: [[PCSF_46006] Domain name is not specified. [FAULT_PCSFUSEREXCEPTION]].]
 [Failed to connect to repository service [EDWDEV].]
An error occurred while accessing the repository[Failed to connect to repository service [EDWDEV].]
 [01/08/2009 00:24:09-[REP_55102] Failed to connect to repository service [EDWDEV].]
Repository connection failed.
Failed to execute connect.

Completed at Thu Jan 08 00:24:09 2009
C:\Informatica\PowerCenter8.6.0\server\bin>pmrep connect -r EDWDEV -d Domain_infra-34af3549f -h infra-34af3549f -o 6001 -n Administrator -s native -x magogate

Informatica(r) PMREP, version [8.6.0], build [178.0613], Windows 32-bit
Copyright (c) Informatica Corporation 1994 - 2008
All Rights Reserved.
This Software may be protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775; 6,850,947; 6,895,471; 7,254,590 and other U.S. Patents Pending.

Invoked at Thu Jan 08 00:26:32 2009
If -d is specified, do not specify -h or -o.
Failed to execute connect.

Completed at Thu Jan 08 00:26:32 2009
C:\Informatica\PowerCenter8.6.0\server\bin>pmrep connect -r EDWDEV -d Domain_infra-34af3549f -h infra-34af3549f -o 6001 -n Administrator -s native -x magogate

C:\Informatica\PowerCenter8.6.0\server\bin>pmrep connect -r EDWDEV -d Domain_infra-34af3549f -n Administrator -s Native -x magogate

Informatica(r) PMREP, version [8.6.0], build [178.0613], Windows 32-bit
Copyright (c) Informatica Corporation 1994 - 2008
All Rights Reserved.
This Software may be protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775; 6,850,947; 6,895,471; 7,254,590 and other U.S. Patents Pending.

Invoked at Thu Jan 08 00:28:11 2009
Connected to repository EDWDEV in Domain_infra-34af3549f as user Administrator
connect completed successfully.

Completed at Thu Jan 08 00:28:16 2009
C:\Informatica\PowerCenter8.6.0\server\bin>
==================
C:\Informatica\PowerCenter8.6.0\server\bin>pmrep executequery -q CheckedInObject

Informatica(r) PMREP, version [8.6.0], build [178.0613], Windows 32-bit
Copyright (c) Informatica Corporation 1994 - 2008
All Rights Reserved.
This Software may be protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775; 6,850,947; 6,895,471; 7,254,590 and other U.S. Patents Pending.

Invoked at Thu Jan 08 00:29:25 2009
mapping m_Dynamic_Target
.
The total number of records returned: 1
executequery completed successfully.

Completed at Thu Jan 08 00:29:30 2009

C:\Informatica\PowerCenter8.6.0\server\bin>pmrep executequery -q CheckedOutObjects -u CheckedOutObjectsList.txt

Informatica(r) PMREP, version [8.6.0], build [178.0613], Windows 32-bit
Copyright (c) Informatica Corporation 1994 - 2008
All Rights Reserved.
This Software may be protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775; 6,850,947; 6,895,471; 7,254,590 and other U.S. Patents Pending.

Invoked at Thu Jan 08 01:07:04 2009

The total number of records returned: 50
executequery completed successfully.

Completed at Thu Jan 08 01:07:09 2009
C:\Informatica\PowerCenter8.6.0\server\bin>

Sunday, May 20, 2012

Partitioning: Things which did not work for me!

Partitioning - now even hearing this word gives me a shock; for the last 20+ days I have been quite frustrated, just because of Informatica partitioning.

I am not sure where it works correctly and actually increases performance (increases throughput at the source and eventually at the target), but I definitely know where it does not work. Being optimistic and showing a positive attitude, I am describing where I failed and the partitioning things which did not work for me.

Before describing the actual problem, let me just walk through the context.

I have a mapping which has a CSV file as the source (with 5 million records in it) and a partitioned table as the target (partitioned based on date), and in between the source and target there are a number of transformations like Joiner, Sorter and Normalizer.

My manager has foreseen the performance problem, because in future the source record count is going to increase; currently the source file does not have 5 million records. Earlier he asked my on-site coordinator to work on this and to check whether adding partitioning at the Informatica level improves the performance or not. After the on-site coordinator got released from the project, that task ultimately got assigned to me.

First hurdle -- At the first run, instead of getting any kind of performance gain, the mapping started taking a hell of a lot of time. I tried a number of things at the session parameter level, but not a single thing helped. After that I tried enabling "Collect Performance Statistics", and while going through the statistics I realized that the data getting loaded at the target was quite a lot more than expected.

When I looked at it closely, I came to know that the data from one of the relational sources was coming in duplicated. That's because, when we add partitions, say 3 for example, all 3 partitions will fetch the data from the relational source, whereas that's not the case with a file as the source. With a file source, if you specify input to only one partition, Informatica automatically transfers data to the remaining partitions, so the data will get divided among the partitions. But in the case of a relational source, if you do not specify a proper filter condition at each partition, the same data will be fetched by all partitions.

If you check "Partitioning Joiner Transformation" [in the help], you will find there are 2 types:
1. 1:n partitioning
2. n:n partitioning

In the case of a Joiner transformation, you can partition only the detail source if you do not want to add a partition point at the Joiner level; that is called 1:n partitioning.

In the case of n:n partitioning, you can add a partition point at the Joiner, but with that you have to add partitions at the master level too. Here, you have to work cautiously: if your master is a relational source, add the condition 1=2 at every partition except one to restrict duplicate data, as sketched below.
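
For illustration, assuming 3 pass-through partitions reading the same relational source (the column name here is hypothetical), the per-partition filter condition overrides could look like this:

-- Restrict the master rows to a single partition, as described above
Partition 1 filter: 1 = 1
Partition 2 filter: 1 = 2
Partition 3 filter: 1 = 2

-- Or, for the detail side, split the rows evenly across all 3 partitions
Partition 1 filter: MOD(employee_id, 3) = 0
Partition 2 filter: MOD(employee_id, 3) = 1
Partition 3 filter: MOD(employee_id, 3) = 2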

2. After clearing this hurdle, I ran the mapping again and got success to a certain level. I would not say the throughput / performance was fantastic, but it was at least much more positive than earlier.

With my past experience, I modified the below parameters at the session level:
a. Default buffer block size -- changed to 1000000
b. Max percentage of total memory allowed for auto memory attributes -- changed to 25
c. Commit interval -- changed to 1000000
d. DTM buffer size -- changed to 1 GB / 2 GB from Auto

This helped me gain performance for a single session, or up to 4-5 simultaneous sessions. But as I kicked off all 10 workflows together, performance went down, and sometimes the un-partitioned sessions were working much better than the partitioned sessions.

3. I had a discussion with Nilesh [it seems I bugged Nilesh very much, but for me he was the only guy to expect some help from on the Informatica or Oracle front]. When we checked the session logs, we came to the conclusion that the bottleneck might be at the target level.

In the session logs, it was the target thread which was always more than 98% busy, and that too for a significant amount of time. So, what was the way to overcome that? We then tried the Oracle loader utility.

4. I created an Oracle loader connection at the workflow level and changed the session parameters. It failed; I tried changing a number of things, but all efforts went in vain. I was totally frustrated, so I left that option aside and started working on other prod issues.

Again, the next day, when I looked closely at the logs, they were telling me that the password was incorrect. So, instead of QA, I changed the loader connection to the Dev DB and kicked off the session again, but still no success. This time at least the control file, input files and log files got created. The input file was empty, and the control file was showing 0 rows inserted.

I was expecting Informatica to create 11 input files and, if not the same number of control files, then at least entries for 11 input files in the same control file. But that was not the case: there was only 1 input file, and only its entry was there in the control file. I am not sure why this is happening, but it seems Informatica always creates only one control file and only one input file.

I still have not got any success on the performance front, but a lot of questions remain:
1. Why does performance go down at the time of insertion? Is that because of constraints, or because of concurrent insertions?

2. Why does Informatica create only 1 control file and 1 input file? What do we have to do if we are running multiple sessions/tasks in a single workflow?

I will update this blog if I get any answers to the above questions, but till then, for my manager, whatever I did is just a waste. For him, the whole time/work is good for nothing; at the end of the day my output is 0. :-(


Tuesday, May 8, 2012

How to pass Target File Name dynamically ?

I had a discussion with Santhosh Garu regarding creating a target file whose name is passed from the mapping itself, with the date appended to it. I had discussed this with Nilesh earlier, and I had even tried it at that time.

Today I did it again: I passed the file name from the mapping with the date appended to it. Below are the steps.

1. Import the target file and, after importing it, click on the "Add FileName column to this table" icon. Once you click on it, a new port automatically gets added to the target table.

2. Import the source (file or relational).
3. Create a simple mapping:

Source --> Expression (to specify the file name) --> Target

In the expression, specify the below as the value of the FileName port: 'HelloWorld_' || TO_CHAR(SYSTIMESTAMP(),'DD-MM-YYYY')
4. Create a workflow and specify the target directory and target file name at the session properties level.
5. The file name which you give at the session level will get overridden by the file name which you pass from the mapping.

Please watch the below video for the same.
http://www.youtube.com/watch?v=QCo929SrBsQ&feature=BFa&list=UUZ4jBpZoYcbYNQAOnuV_edg

Monday, April 30, 2012

Unix Commands ..

Today we will practice some Unix commands which we use regularly; they are really helpful and required when your file has more than 60K records, because you cannot open such files in Excel.
  1. How will you find the row count of a file?
  2. How will you find the total number of files inside a directory?
  3. How will you print the first 2 lines of a file?
  4. How to find the number of records without displaying the file name in the output
  5. Suppose you have 100 records, out of which you want to read records 10 to 20. You have a header in the file; how will you do that?
  6. I want distinct Department_Id from the Employee file
  7. Check with Piyush what command he used to fire for record count without the header record.
  8. I want all employees with department as 10?
  9. I've a flat file where the 1st two bytes of each record are an identifier for that record. How do I find the distinct count of these records?
  10. How will you compare 2 files?
  11. Find files starting with Emp in a particular directory
  12. How will you find a particular name in a file?
  13. How will you find files in a directory?
  14. How will you search for a file in the same / current directory?
  15. How will you change the file permissions of all files inside a particular directory?
  16. How will you search all files owned by a particular user?
  17. How will you replace a character / string in a particular file with another character / string?
  18. How will you sort files by their size in a particular directory?
  19. How will you sort files by their timestamps?
  20. To find out the total size of a folder
  21. How will you delete particular content of a file?
  22. How will you print the last 2 lines of a file?
  23. How to find folders in a particular directory
  24. Thanks to Nikhil Kulkarni for the below Unix part
  25. directory commands
  26. Process commands
  27. Sort files by date
  28. sort files by size
  29. grep
  30. find
  31. Creating Files
  32. Chmod command.
  33. File Operations
  34. Wc command
  35. Mv command.
  36. Rm command.
  37. Rmdir command
  38. Cut command
  39. Paste Command
  40. Cmp command.
  41. Sort command
  42. Head command
  43. Tail command
  44. Crontab command

1. How will you find the row count of a file?
Mandar@Mandar-PC /MyFiles
$ wc -l Employees.csv
108 Employees.csv

2. How will you find the total number of files inside a directory?
Mandar@Mandar-PC /MyFiles
$ ls | wc -l
8

3. How will you print first 2 lines of file?
Mandar@Mandar-PC /MyFiles
$ head -2 Employees.csv
employee_id,first_name,last_name,email,phone_number,hire_date,job_id,salary,commission_pct,manager_id,department_id
100,Steven,King,SKING,515.123.4567,6/17/1987,AD_PRES,24000,null,null,90


4. How to find number of records without displaying file name in output
Mandar@Mandar-PC /MyFiles
$ awk '{n++} END {print n}' Employees.csv
108


Even this will work the same way:
Mandar@Mandar-PC /MyFiles
$ wc -l < Employees.csv
108

5. Suppose you have 100 records, out of which you want to read records 10 to 20. You have a header in the file; how will you do that?
Mandar@Mandar-PC /MyFiles
$ head -20 Employees.csv | tail -10
109,Daniel,Faviet,DFAVIET,515.124.4169,8/16/1994,FI_ACCOUNT,9000,null,108,100
110,John,Chen,JCHEN,515.124.4269,9/28/1997,FI_ACCOUNT,8200,null,108,100
111,Ismael,Sciarra,ISCIARRA,515.124.4369,9/30/1997,FI_ACCOUNT,7700,null,108,100
112,Jose Manuel,Urman,JMURMAN,515.124.4469,3/7/1998,FI_ACCOUNT,7800,null,108,100
113,Luis,Popp,LPOPP,515.124.4567,12/7/1999,FI_ACCOUNT,6900,null,108,100
114,Den,Raphaely,DRAPHEAL,515.127.4561,12/7/1994,PU_MAN,11000,null,100,30
115,Alexander,Khoo,AKHOO,515.127.4562,5/18/1995,PU_CLERK,3100,null,114,30
116,Shelli,Baida,SBAIDA,515.127.4563,12/24/1997,PU_CLERK,2900,null,114,30
117,Sigal,Tobias,STOBIAS,515.127.4564,7/24/1997,PU_CLERK,2800,null,114,30
118,Guy,Himuro,GHIMURO,515.127.4565,11/15/1998,PU_CLERK,2600,null,114,30
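
An equivalent one-liner, if you prefer sed (prints lines 11 to 20 of the file, which are the same rows as above when line 1 is the header):

sed -n '11,20p' Employees.csv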


6. I want distinct Department_Id from Employee File
Mandar@Mandar-PC /MyFiles
$ cut -d"," -f11 Employees.csv | sort | uniq
10
100
110
20
30
40
50
60
70
80
90
department_id
null

Run man cut to get information on all the parameters.
uniq will output all lines exactly once.
uniq -d will output only the lines that appear more than once, and it will print them once.
uniq -u will output only the lines that appear exactly once.


7. Check with Piyush what command he used to fire for record count without the header record.
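For the record, one common way to count records while skipping the header (tail -n +2 starts printing from line 2):

tail -n +2 Employees.csv | wc -l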


8. I want all employees with department as 10
Mandar@Mandar-PC /MyFiles/NewFolder
$ awk -F"," '{ if($11==10) print $0}' Employees1.csv > Dep10File.csv


9. I've a flat file where the 1st two bytes of each record are an identifier for that record. How do I find the distinct count of these records?
Flat file ex:
AB|123|hello moto|
AB|456|googly|
BC|sick day|booya|
ID|inter doing|hahahah

So I want to find distinct record count... it should give me
AB 2
BC 1
ID 1

Mandar@Mandar-PC /MyFiles
$ awk -F'|' '{++c[$1]}END{for (i in c)print i, c[i]}' SampleData.txt
AB 2
BC 1
ID 1


10. How will you compare 2 files
Mandar@Mandar-PC /MyFiles
$ comp mandar.txt myname.txt
Comparing mandar.txt and myname.txt...
Compare error at OFFSET 6
file1 = A
file2 = 20
Compare error at OFFSET 10
file1 = A
file2 = 20
Compare more files (Y/N) ? n

Mandar@Mandar-PC /MyFiles
$ diff mandar.txt myname.txt
1,3c1
< mandar
< raghunath
< gogate
---
> mandar raghunath gogate

Mandar@Mandar-PC /MyFiles
$ sdiff mandar.txt myname.txt
mandar | mandar raghunath gogate
raghunath <
gogate <


11. Find out files starting with Emp in a particular directory
Mandar@Mandar-PC /MyFiles
$ ls -l | grep Emp*
-rwxr-xr-x 1 Mandar None 8314 May 1 00:39 Employees.csv


12. How will you find particular name in a file ?
Mandar@Mandar-PC /MyFiles
$ grep Sundi* Employees.csv
166,Sundar,Ande,SANDE,011.44.1346.629268,3/24/2000,SA_REP,6400,0.1,147,80
173,Sundita,Kumar,SKUMAR,011.44.1343.329268,4/21/2000,SA_REP,6100,0.1,148,80


13. How will you find files in a directory ?
Mandar@Mandar-PC /MyFiles
$ find -name 'Emp*'
./Employees.csv
./Employees1.csv
./Employees10.csv
./Employees11.csv
./Employees12.csv
./Employees2.csv
./Employees3.csv
./Employees4.csv
./Employees5.csv
./Employees6.csv
./Employees7.csv
./Employees8.csv
./Employees9.csv

As the files were inside the MyFiles folder itself, it gave the search result like this, but the same command works for finding files in sub-directories also. Now I have moved all Emp* files to a new directory called NewFolder, which is inside the MyFiles directory.
Mandar@Mandar-PC /MyFiles
$ find -name 'Emp*'
./NewFolder/Employees.csv
./NewFolder/Employees1.csv
./NewFolder/Employees10.csv
./NewFolder/Employees11.csv
./NewFolder/Employees12.csv
./NewFolder/Employees2.csv
./NewFolder/Employees3.csv
./NewFolder/Employees4.csv
./NewFolder/Employees5.csv
./NewFolder/Employees6.csv
./NewFolder/Employees7.csv
./NewFolder/Employees8.csv
./NewFolder/Employees9.csv

This command searches even the directory name and not only the file name
Mandar@Mandar-PC /MyFiles
$ find . -name New*
./NewFolder

To find only the files and exclude directories:
Mandar@Mandar-PC /MyFiles
$ find . -type f -name New*

14. How will you search for a file in the same / current directory?
To search in all directories:
Mandar@Mandar-PC /MyFiles
$ find / -type f -name Emp*

15. How will you change the file permissions of all files inside particular directory ?
Mandar@Mandar-PC /MyFiles
$ find ./NewFolder/ -exec chmod 777 {} \;

16. How will you search all files owned by a particular user
Mandar@Mandar-PC /MyFiles
$ find . -user mandar

17. How will you replace a character / string in a particular file with another character / string?
Mandar@Mandar-PC /MyFiles/NewFolder
$ less Employees1.csv | tr 'manager_id' 'MANAGER_ID' | less
The above command displays the translated output at the command line. Note that tr translates character sets, not whole strings: here every occurrence of the characters m, a, n, g, e, r, i and d gets uppercased wherever it appears, which only looks like a string replacement on this header line. You can redirect this output to a new file with the below command:
Mandar@Mandar-PC /MyFiles/NewFolder
$ less Employees1.csv | tr 'manager_id' 'MANAGER_ID' > MyNewFile.csv
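
For a true string replacement, sed is the usual tool; a minimal sketch with the same file names as above:

sed 's/manager_id/MANAGER_ID/g' Employees1.csv > MyNewFile.csv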

18. How will you sort files by their size in a particular directory?
Mandar@Mandar-PC /MyFiles/NewFolder
$ du -s * | sort -n
0 Employees.csv
12 Employees1.csv
12 MyNewFile.csv
20 Employees2.csv
36 Employees3.csv
68 Employees4.csv
180 Employees5.csv
344 Employees6.csv
764 Employees7.csv
2292 Employees8.csv
9160 Employees9.csv
27476 Employees10.csv
137376 Employees11.csv
412128 Employees12.csv

Mandar@Mandar-PC /MyFiles/NewFolder
$ du -a | sort -n
0 ./Employees.csv
1 ./helloworld.txt
12 ./Employees1.csv
12 ./MyNewFile.csv
20 ./Employees2.csv
36 ./Employees3.csv
68 ./Employees4.csv
180 ./Employees5.csv
344 ./Employees6.csv
764 ./Employees7.csv
2292 ./Employees8.csv
9160 ./Employees9.csv
27476 ./Employees10.csv
137376 ./Employees11.csv
412128 ./Employees12.csv
589873 .


19. How will you sort files by their timestamps
Mandar@Mandar-PC /MyFiles/NewFolder
$ ls -ctl | sort -n
-rw-r--r-- 1 Mandar None 11 May 5 12:34 helloworld.txt
-rw-r--r-- 1 Mandar None 8314 May 5 12:20 MyNewFile.csv
-rwxrwxrwx 1 Mandar None 0 May 5 12:09 Employees.csv
-rwxrwxrwx 1 Mandar None 8314 May 5 12:03 Employees1.csv
-rwxrwxrwx 1 Mandar None 16628 May 5 12:03 Employees2.csv
-rwxrwxrwx 1 Mandar None 33256 May 5 12:03 Employees3.csv
-rwxrwxrwx 1 Mandar None 66512 May 5 12:03 Employees4.csv
-rwxrwxrwx 1 Mandar None 182908 May 5 12:03 Employees5.csv
-rwxrwxrwx 1 Mandar None 349188 May 5 12:03 Employees6.csv
-rwxrwxrwx 1 Mandar None 781516 May 5 12:03 Employees7.csv
-rwxrwxrwx 1 Mandar None 2344548 May 5 12:03 Employees8.csv
-rwxrwxrwx 1 Mandar None 9378192 May 5 12:03 Employees9.csv
-rwxrwxrwx 1 Mandar None 28134576 May 5 12:03 Employees10.csv
-rwxrwxrwx 1 Mandar None 140672880 May 5 12:03 Employees11.csv
-rwxrwxrwx 1 Mandar None 422018640 May 5 12:03 Employees12.csv
total 589869


20. To find out total size of a folder
Mandar@Mandar-PC /MyFiles/NewFolder
$ du -h
577M .
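
21. How will you delete particular content of a file
One common way (a sketch; the pattern and file name here are just examples) is sed, which deletes every line matching a pattern; with GNU sed's -i it edits the file in place:

sed -i '/PU_CLERK/d' Employees1.csv
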
22. How will you print last 2 lines of a file
Mandar@Mandar-PC /MyFiles
$ tail -2 Second.sh
echo -e "\n\nFile $1, found and successfully echoed"
fi

23. How to find folders in particular directory
Mandar@Mandar-PC /MyFiles
$ find . -type d
.
./NewFolder
24. Directory commands
--------------------
mkdir directoryname -> creates a new directory
cd directoryname -> changes directory

25.Process commands
-----------------

PS command
ps command is probably the most useful command for systems administrators. It reports information on active processes.
ps [options]

* -a Lists all processes in the system except processes not attached to terminals.
* -e Lists all processes in the system.
* -f Gives a full listing.
* -j Prints the process group ID and session ID.

ps -e -- lists all processes with information about each process

kill pid -- kills the process


Mandar@Mandar-PC /MyFiles/NewFolder
$ ps -a
PID PPID PGID WINPID TTY UID STIME COMMAND
4104 1912 4104 7052 pty0 1000 11:17:00 /usr/bin/bash
1912 1 1912 1912 ? 1000 11:17:00 /usr/bin/mintty
4764 4104 4764 6688 pty0 1000 13:59:06 /usr/bin/bash
2496 4764 2496 2344 pty0 1000 19:55:13 /usr/bin/ps

Mandar@Mandar-PC /MyFiles/NewFolder
$ ps -e
PID PPID PGID WINPID TTY UID STIME COMMAND
4104 1912 4104 7052 pty0 1000 11:17:00 /usr/bin/bash
1912 1 1912 1912 ? 1000 11:17:00 /usr/bin/mintty
4764 4104 4764 6688 pty0 1000 13:59:06 /usr/bin/bash
5908 4764 5908 3040 pty0 1000 19:55:46 /usr/bin/ps

Mandar@Mandar-PC /MyFiles/NewFolder
$ ps -f
UID PID PPID TTY STIME COMMAND
Mandar 4104 1912 pty0 11:17:00 /usr/bin/bash
Mandar 1912 1 ? 11:17:00 /usr/bin/mintty
Mandar 4764 4104 pty0 13:59:06 /usr/bin/bash
Mandar 5496 4764 pty0 19:56:05 /usr/bin/ps



26.Sort files by date
-------------------
ls - list all files in a directory

-a displays all files
-l Displays the long format listing.
-t Displays newest files first. (based on timestamp)
-r Displays files in reverse order.

ls -ltr : displays files based on timestamp in ascending order.

Only files. Excluding directories
Mandar@Mandar-PC /MyFiles/NewFolder
$ ls -l | wc -l
35

Including default directories. i.e. 2 more
Mandar@Mandar-PC /MyFiles/NewFolder
$ ls -al | wc -l
37

-r reverses the output order, so ls -ltr displays the files in ascending timestamp order (oldest first):
Mandar@Mandar-PC /MyFiles/NewFolder
$ ls -ltr
total 2017219
-rwxrwxrwx 1 Mandar None 8314 May 5 01:17 Employees1.csv
-rwxrwxrwx 1 Mandar None 16628 May 5 01:18 Employees2.csv
-rwxrwxrwx 1 Mandar None 33256 May 5 01:18 Employees3.csv
-rwxrwxrwx 1 Mandar None 66512 May 5 01:18 Employees4.csv
-rwxrwxrwx 1 Mandar None 182908 May 5 01:19 Employees5.csv
-rwxrwxrwx 1 Mandar None 349188 May 5 01:19 Employees6.csv
-rwxrwxrwx 1 Mandar None 781516 May 5 01:20 Employees7.csv
-rwxrwxrwx 1 Mandar None 2344548 May 5 01:20 Employees8.csv
-rwxrwxrwx 1 Mandar None 9378192 May 5 01:21 Employees9.csv
-rwxrwxrwx 1 Mandar None 28134576 May 5 01:21 Employees10.csv
-rwxrwxrwx 1 Mandar None 140672880 May 5 01:22 Employees11.csv
-rwxrwxrwx 1 Mandar None 422018640 May 5 01:22 Employees12.csv
-rw-r--r-- 1 Mandar None 8314 May 5 12:20 MyNewFile.csv
-rw-r--r-- 1 Mandar None 11 May 5 12:34 helloworld.txt
-rw-r--r-- 1 Mandar None 8185 May 5 17:30 Dep10File.csv
-rw-r--r-- 1 Mandar None 8197 May 13 20:43 1.csv

27. Sort files by size
------------------------

ls -al | sort +4n

This command performs a numeric sort on the fifth column of the "ls -al" output (with a modern sort, the equivalent syntax is "sort -k 5n").
This results in a file listing where the files are listed in ascending order, from smallest in size to largest in size.

28.grep
-----
grep [options] pattern [files]

-i Ignore case sensitivity.
-w Match whole word.
-n line number
-c count

Command to find the word "Error" in file1 and append it to File2.txt: grep Error file1 >> File2.txt
ps -ef | grep sleep will display all the sleep processes running in the system.


Mandar@Mandar-PC /MyFiles/NewFolder
$ cat > GrepTest.txt
Mandar R. Gogate

Mandar@Mandar-PC /MyFiles/NewFolder
$ grep mandar


Mandar@Mandar-PC /MyFiles/NewFolder
$ grep mandar GrepTest.txt

Mandar@Mandar-PC /MyFiles/NewFolder
$ grep -i mandar GrepTest.txt
Mandar R. Gogate

Mandar@Mandar-PC /MyFiles/NewFolder
$ grep -in mandar GrepTest.txt
1:Mandar R. Gogate


Mandar@Mandar-PC /MyFiles/NewFolder
$ grep Mandar GrepTest.txt
Mandar R. Gogate

Mandar@Mandar-PC /MyFiles/NewFolder
$ grep -w Man GrepTest.txt

Mandar@Mandar-PC /MyFiles/NewFolder
$ grep Man GrepTest.txt
Mandar R. Gogate


Mandar@Mandar-PC /MyFiles/NewFolder
$ grep -ci man GrepTest.txt
1

Mandar@Mandar-PC /MyFiles/NewFolder
$ grep -i man GrepTest.txt > Out.txt



29.find
----

find : finds a file in a directory

-mtime is the last modified time of a file

-atime is the last access time

1. To find files modified in the last 5 days:

find . -mtime -5


2. To find files modified more than 5 days ago:

find . -mtime +5

-> If you omit the '+', it has a different meaning: -mtime 5 means files modified exactly 5 days ago.



Files modified within the last day:
Mandar@Mandar-PC /MyFiles/NewFolder
$ find -mtime -1
.
./GrepTest.txt
./Out.txt
./Way1.txt
./Way2.txt
./Way3.txt


Files accessed within the last day:
Mandar@Mandar-PC /MyFiles/NewFolder
$ find . -atime -1
.
./GrepTest.txt
./Out.txt
./Way1.txt
./Way2.txt
./Way3.txt


Files last accessed more than 45 days ago:
Mandar@Mandar-PC /MyFiles/NewFolder
$ find . -atime +45
./Employees.csv




30.Creating Files
===========================

cat
touch

cat > filename, then press CTRL+D to finish

> creates a file
< feeds a file as input (e.g. cat < filename views it)
>> appends to a file

touch: this command creates an empty file

31.Chmod command.
-------------
chmod command is used to change permissions on a file.
For example, say I have a text file with a calendar in it called cal.txt.
Initially, when this file is created, its permissions depend upon the umask set in your profile files.
As you can see, this file has 666 or -rw-rw-rw- attributes.

ls -la cal.txt

-rw-rw-rw- 1 ssb dxidev 135 Dec 3 16:14 cal.txt

In the line above, -rw-rw-rw- means respectively that the owner can read and write the file, members of the owner's group can read and write this file, and anyone else connected to this system can read and write this file. Next, ssb is the owner of this file, dxidev is the group of this file, there are 135 bytes in this file, this file was created on December 3 at 16:14, and at the end there is the name of this file.

Learn to read these permissions in binary. For example, decimal 644 is 110 100 100 in binary, which means rw-r--r--: the user can read and write this file, the group can read only, and everyone else can read only. Similarly, if the permissions are 755, or 111 101 101, that means rwxr-xr-x: the user can read, write and execute, the group can read and execute, and everyone else can read and execute. All directories have d in front of the permissions.

So if you don't want anyone to see your files or to do anything with them, use the chmod command and set the permissions so that only you can read and write that file, i.e. chmod 600 filename.

r -4
w -2
x -1

32.File Operations
--------------------------

33.Wc command
------------------
wc command counts the characters, words or lines in a file depending upon the option.

* Options wc -l filename will print total number of lines in a file.
* wc -w filename will print total number of words in a file.
* wc -c filename will print total number of characters in a file.

34.Mv command.
------------------
mv command is used to move a file from one directory to another directory or to rename a file.

* Some examples: mv oldfile newfile will rename oldfile to newfile.
* mv -i oldfile newfile for confirmation prompt.
* mv -f oldfile newfile will force the rename even if target file exists.
* mv * /usr/bajwa/ will move all the files in current directory to /usr/bajwa directory.

35.Rm command.
------------------
To delete files use rm command.

* Options: rm oldfile will delete file named oldfile.
* rm -f option will remove write-protected files without prompting.
* rm -r option will delete the entire directory as well as all the subdirectories, very dangerous command.

36.Rmdir command.
------------------
rmdir command will remove directory or directories if a directory is empty.

* Options: rm -r directory_name will remove all files even if directory is not empty.
* rmdir sandeep is how you use it to remove sandeep directory.
* rmdir -p will remove directories and any parent directories that are empty.
* rmdir -s will suppress standard error messages caused by -p.

37.Cut command.
------------------
cut command selects a list of columns or fields from one or more files.
Option -c is for columns and -f for fields. It is entered as
cut options [files]
for example if a file named testfile contains

this is firstline
this is secondline
this is thirdline

Examples:
cut -c1,4 testfile will print this to standard output (screen)

ts
ts
ts

It is printing columns 1 and 4 of this file, which contain t and s (from "this").

* Options: -c list cut the column positions identified in list.
* -f list will cut the fields identified in list.
* -s could be used with -f to suppress lines without delimiters.

38.Paste Command.
------------------
paste command merges the lines of one or more files into vertical columns separated by a tab.
for example if a file named testfile contains

this is firstline

and a file named testfile2 contains

this is testfile2

then running this command
paste testfile testfile2 > outputfile
will put this into outputfile

this is firstline this is testfile2

it contains contents of both files in columns.
who | paste - - will list users in two columns.

* Options: -d'char' separate columns with char instead of a tab.
* -s merge subsequent lines from one file.

Mandar@Mandar-PC /MyFiles/NewFolder
$ paste GrepTest.txt Way2.txt > PasteTest.txt

Mandar@Mandar-PC /MyFiles/NewFolder
$ cat PasteTest.txt
Mandar R. Gogate Hello World

39.Cmp command.
-----------------
cmp command compares two files byte by byte and reports the first mismatch.

40.Sort command.

------------------
sort command sorts the lines of a file or files in alphabetical order. For example, if you have a file named testfile with these contents:

zzz
aaa
1234
yuer
wer
qww
wwe

Then running
sort testfile
will give us output of

1234
aaa
qww
wer
wwe
yuer
zzz

* Options: -b ignores leading spaces and tabs.
* -c checks whether files are already sorted.
* -d ignores punctuation.
* -i ignores non-printing characters.
* -n sorts in arithmetic order.
* -o file puts the output in a file.
* +n[-m] skips n fields before sorting, and sorts up to field position m.
* -r reverses the order of the sort.
* -u makes identical lines in the input file appear only once in the output.


41.Head command.
------------------
head filename by default will display the first 10 lines of a file.
If you want the first 50 lines you can use head -50 filename, or for 37 lines head -37 filename, and so forth.

42.Tail command.
------------------
tail filename by default will display the last 10 lines of a file.
If you want the last 50 lines, then you can use tail -50 filename.

43.Crontab command.
------------------
crontab command is used to schedule jobs. You must be given permission to run this command by the Unix administrator. Jobs are scheduled with five numbers, as follows.

Minutes 0-59
Hour 0-23
Day of month 1-31
month 1-12
Day of week 0-6 (0 is sunday)

So, for example, say you want to schedule a job which runs the script named backup_jobs in the /usr/local/bin directory at 22:25 (10:25 pm) on the 15th of the month and on Sundays (day 0). The entry in the crontab file will be as below; * represents all values.

25 22 15 * 0 /usr/local/bin/backup_jobs

The * here tells the system to run this in every month.
The syntax is: crontab filename. So create a file with the scheduled jobs as above and then type crontab filename; this will schedule the jobs.
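
Two related invocations worth knowing (standard crontab options):

crontab -l    -> lists the current user's scheduled jobs
crontab -e    -> opens the crontab for editing in your default editor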
----------------------------------------------------------------------------

-> The mail command is used to send mails.


Sunday, April 29, 2012

Shell Scripting - First Day !!

After a long gap, I am writing this blog on shell scripting, something which is still new to me despite being in the IT industry for more than 5 years now. To begin with, we will execute a HelloWorld program as per the tradition; but before that, below is the link which I am using for practicing:

http://www.freeos.com/guides/lsst/ch03sec01.html

I have installed Cygwin for practicing shell and Unix commands. Though vi does not work here, you can always type your commands on the Windows machine [Notepad, WordPad or Notepad++] and execute them with Cygwin. Yes, you can access Windows files from Cygwin.

First Program - HelloWorld

------------------
#!/bin/sh
#
#First Script of Hello World
echo "Hello World"
------------------
Make the script executable, then run it from the command prompt as below:
chmod +x HelloWorld.sh
./HelloWorld.sh


Second Program - Print Contents Inside Files

-----------------------
#!/bin/sh
#
#Second Program to print the content of file
if cat $1
then
echo -e "\n\nFile $1, found and successfully echoed"
fi


I wrote the above script in Notepad++, but when I tried to execute it from the command prompt, it failed giving the below error:
./Second.sh: line 9: syntax error: unexpected end of file

But when I typed the same commands at the command line itself, they got executed successfully. The most likely reason is that Notepad++ saved the file with Windows CRLF line endings, which the shell chokes on; converting the file with dos2unix (or saving it with Unix line endings in Notepad++) should fix it.

Detailed explanation
If the cat command finds the file passed as $1 and it is successfully shown on screen, it means our cat command is successful and its exit status is 0 (indicating success). So our if condition is also true, and hence the statement echo -e "\n\nFile $1, found and successfully echoed" is processed by the shell. If the cat command is not successful, it returns a non-zero value (indicating some sort of failure), and this echo statement is skipped by our shell.


Now you will also require the echo parameter description, which is as follows:

echo [options] [string, variables...]
Displays text or variable values on screen.
Options
-n Do not output the trailing new line.
-e Enable interpretation of the following backslash escaped characters in the strings:
\a alert (bell)
\b backspace
\c suppress trailing new line
\n new line
\r carriage return
\t horizontal tab
\\ backslash

For example: $ echo -e "An apple a day keeps away \a\t\tdoctor\n"

Saturday, March 3, 2012

Creating dummy data !!

I was just going through Oracle blogs and came across the below site:
http://www.oracle-developer.net/display.php?id=515

Most of the time when you are practicing Oracle or Informatica code, you require dummy data to be available in a table. Particularly when you are working on performance testing of Oracle and Informatica, a large volume of data is required. How will you create it? One way is to insert some records, then take a union of them, and a union of that again, and so on. But this will create repetition of data in your table.

The best way is below:


Create table test as
SELECT ROWNUM                     AS id
,      MOD(ROWNUM,2000)           AS grp
,      DBMS_RANDOM.STRING('u',5)  AS val
,      DBMS_RANDOM.STRING('u',30) AS pad
FROM   dual
CONNECT BY ROWNUM <= 1000000;



Friday, February 17, 2012

Saving an error while DML statement...

It was interview time, as our project went to TCS; everyone was either facing some interview or preparing for one.

Birwatkar was facing an interview at CnC, and below was one of the questions asked to him:

How do you log errors while performing DML operations through SQL?



1. First create the source and target tables:

create table insertTestSourec
(
    id varchar(200)
);

create table insertTestTarget
(
    id number
);

2. Create the error log table using the DBMS_ERRLOG.create_error_log procedure:

BEGIN
  DBMS_ERRLOG.create_error_log (dml_table_name => 'insertTestTarget');
END;
/

Then populate the source table:

insert into insertTestSourec values(1);
insert into insertTestSourec values(2);
insert into insertTestSourec values('t');


Now try a plain insert:

insert into insertTestTarget
select * from insertTestSourec;

Here, the statement will get rolled back and not a single record will get inserted into the target table, because the value 't' cannot be converted to a number.

Check that your error log table got created using the below SQL:

select *
from all_tables
where table_name like upper('%insertTestTarget%')

3. Now run the insert with the LOG ERRORS clause:
insert into insertTestTarget
select * from insertTestSourec
LOG ERRORS INTO ERR$_INSERTTESTTARGET ('INSERT') REJECT LIMIT UNLIMITED;

4. After execution of the above statement, 2 records (1 and 2) will get inserted into the target table, and the incorrect record will get inserted into ERR$_INSERTTESTTARGET.

select *
from ERR$_INSERTTESTTARGET
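
The ERR$_ table also carries standard metadata columns such as ORA_ERR_NUMBER$ and ORA_ERR_MESG$, so a more targeted check, as a sketch, is:

select ora_err_number$, ora_err_mesg$, id
from ERR$_INSERTTESTTARGET;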

Thursday, January 5, 2012

Comparing Map and Folders
Ignore In Comparison at Lookup and transferring that to another folder while migration
With clause query of date range for previous weeks monday - friday
