
Talend ETL Part 1: SQL Server Database to Excel Sheet


Hello All,

Of the many ETL tools available on the market, Talend is one of the strongest. What sets tools like Pentaho, Talend, CloverETL, and Adeptia Integration apart from other ETL tools is that they support cross-domain sources such as NoSQL databases, big data, and Hadoop.

Other ETL tools, such as SSIS and Informatica, are now adding Hadoop integration in their newer versions.

Broadly speaking, there are two types of databases:
1) RDBMS (for example: SQL Server, MySQL, Oracle)
2) Non-RDBMS (for example: MongoDB, Cassandra)

Talend supports non-RDBMS databases as well as relational ones.

Here I would like to share my hands-on experience with Talend, show how to use it, and explain its basic components.

Talend provides approximately 500 components.

So let's start with the basics. First, let's load data from Microsoft SQL Server into an Excel sheet.

Steps:

Step 1: Open Talend Studio.



Step 2: Right-click Job Design and create a new job.



Step 3: Enter a name for the job and click Finish.



Step 4: To connect to the database, click DB Connections and create a new connection.




Step 5: Enter a name for the new database connection, then click Next.




Step 6: Select the database type you want to connect to from the drop-down list.



Step 7: I had already created an ODBC data source for MSSQL, so I am using that. You can also use an OLE DB connection. Enter the connection details, check that the connection is established, and then click Finish.
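For readers who have not set up ODBC before, the connection details entered in this step boil down to a driver, a server, a database, and credentials. Here is a minimal sketch in Python of how those pieces are usually combined into an ODBC connection string. The driver name "ODBC Driver 17 for SQL Server", the server "localhost", and the database "ExampleDB" are assumptions for illustration, not values from this walkthrough, and Talend builds its own connection internally.

```python
def build_odbc_connection_string(server, database,
                                 driver="ODBC Driver 17 for SQL Server",
                                 user=None, password=None):
    """Assemble an ODBC connection string for SQL Server.

    The default driver name is an assumption; use whichever driver
    is actually installed on your machine.
    """
    parts = [
        f"DRIVER={{{driver}}}",   # braces are part of the ODBC syntax
        f"SERVER={server}",
        f"DATABASE={database}",
    ]
    if user is None:
        # No user supplied: fall back to Windows authentication.
        parts.append("Trusted_Connection=yes")
    else:
        # SQL Server authentication with an explicit login.
        parts.append(f"UID={user}")
        parts.append(f"PWD={password}")
    return ";".join(parts)


# Hypothetical server and database names, for illustration only.
conn_str = build_odbc_connection_string("localhost", "ExampleDB")
print(conn_str)
```

The "check connection" button in the wizard is effectively testing that a string like this one can open a session against the server.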




Step 8: Under Metadata, we can now see that a DB connection has been created with the name Twitter_Data. Right-click the DB connection and click Retrieve Schema.





Step 9: The following window pops up. Click Next.




Step 10: Expand the database, select the schema that contains your table, and under that schema select the table you need.





Step 11: 




Step 12: After clicking Finish, we can see that the schema has been created under Metadata. We can now drag and drop it onto the designer page.
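Conceptually, Retrieve Schema reads the column names and types of the selected table and stores them as metadata. The sketch below illustrates that idea in Python. It uses an in-memory SQLite database as a stand-in for SQL Server so that it runs anywhere, and the Employee columns are hypothetical. This is not how Talend implements the feature, only an illustration of what "retrieving a schema" means.

```python
import sqlite3


def retrieve_schema(conn, table):
    """Return (column_name, declared_type) pairs for one table.

    PRAGMA table_info is SQLite-specific. On SQL Server the same
    information is exposed through INFORMATION_SCHEMA.COLUMNS.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, default_value, pk).
    return [(row[1], row[2]) for row in rows]


# Stand-in database with a hypothetical Employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary REAL)")

schema = retrieve_schema(conn, "Employee")
print(schema)
```

Once the metadata is stored, any component that uses it knows the column layout without asking the database again.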




Step 13: Drag and drop the Employee schema onto the designer. When we do, the following window pops up. Select tDBInput.





Step 14: 




Step 15: In the Palette on the right-hand side, select Files and then Output.




Step 16: Under Files > Output, select the tFileOutputExcel component and drag and drop it onto the designer window.





Step 17: Next, right-click the EMPLOYEE component in the designer and select Row > Main. This produces a line that is used to connect components. Drag the line and drop it on tFileOutputExcel.




Step 18: 


Step 19: To configure tFileOutputExcel, click the component once and, under Component properties below (as shown), fill in the highlighted fields, which cover details such as the path of the destination Excel file and the header settings.
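It may help to picture what the Row > Main link and the output settings amount to: the input component produces rows one at a time, and the output component writes each row to the file, optionally preceded by a header line. The following Python sketch mimics that flow under stated assumptions: SQLite stands in for SQL Server, CSV text stands in for the Excel file (the standard library has no Excel writer), and the table contents are made up. The function names are hypothetical and are not Talend APIs.

```python
import csv
import io
import sqlite3


def read_rows(conn, query):
    """Input side: yield the column names first, then each data row."""
    cursor = conn.execute(query)
    yield [col[0] for col in cursor.description]
    for row in cursor:
        yield list(row)


def write_rows(rows, out, include_header=True):
    """Output side: write rows as they arrive; return the data row count."""
    writer = csv.writer(out)
    header = next(rows)            # first item is always the column names
    if include_header:
        writer.writerow(header)
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Stand-in source with hypothetical data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Id INTEGER, Name TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi")])

out = io.StringIO()                # stand-in for the destination file
written = write_rows(read_rows(conn, "SELECT Id, Name FROM Employee"), out)
print(written, "rows written")
```

Because the rows are passed along one at a time, the whole table never has to sit in memory, which is the same reason a row link between two components scales to large tables.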





Step 20: Everything is now set up and we are ready to run the job. To run it, select the Run tab below and use the Run button under Basic Run to run the full job.





Step 21: Alternatively, to debug and watch the data row by row, select Debug Run under the Run tab and run the job from there.






