While it may be easy to create SSIS packages or project, creating a scalable ETL solution is another story. When I do SSIS presentations or discuss with ETL developers, it seems that there are many ways to build an SSIS application. When we talk about best practices around SSIS architecture, I often get this answer: “We did it like this and it work fine in production”. While I think what I in production work OK, it does not mean that the application being built is made upon best practices and makes it scalable enough. As an example, I always ask these questions:
– Are you able to tell which packages executed, how many rows have been selected, inserted, updated or deleted?
– How much time it took to run a specific package?
– Which task inside a package executed the longest?
– How do you record errors and warnings?
For me, answers to these questions represent basic ETL Framework function that should be implemented into almost every SSIS applications. How ETL solution should be architected around this kind of framework would require a book on these topics. Earlier this week, I purchased a book that is specifically written to answer most of the above questions. The authors of the book are what I should call Subject Matters Experts on SSIS. They talk about real life scenarios, performance monitoring, framework design, etc. Here is the link to the book: SQL Server: Microsoft SQL Server 2008 Integration Services Problem-Design-Solution – Book Information and Code Download – Wrox.
So, beyond SSIS packages, we now have a reference on how an application should be structured. Over the years, I found on the web almost all informations that are in this book and I have been fortunate enough to be able to grab and implement almost all best practices in the book. While I have minor questions or arguments on some subtopics, I think that this book should be one of the most useful on your shelve.
Christian
Leave a Reply