Thursday, December 26, 2013

Identity overlap

Have you ever stopped to think about why on earth you stay so long in a job? Why are you faithful to a company or brand? Is it the money? Is it the challenge? Stability? Ego?

Several studies indicate that the more you identify with your company, the better you perform. Drawing on the study by Bergami and Bagozzi (2000) about Self-identity versus Group Identity, take a look at the picture below. How closely do you relate to your company? How many of your own characteristics map to characteristics of your enterprise?




Just like everything in life, the edges can be dangerous if you're not conscious of where you stand and you don't know how to take advantage of that "position". For instance, if you're at "Level 0", you certainly should have changed jobs a long time ago! The other edge is a cautionary one. You might LOVE the company, the job, everything is perfect. But what if tomorrow something goes wrong? Will it be like a failed marriage? What if someone offers you an obscene amount of money? Will passion win over money?

Nevertheless, I like the "Level 7" edge.
Feel it! Believe in it! Love it! Give everything. Get Involved!
If one day you no longer feel the job, change what's wrong! Change the job, change yourself or change jobs! It's like running or a soccer match: you may only last 30 minutes, but give it all in those 30 minutes. If you can only breathe for 20, cool: die in those 20, but give it all you've got!


Thursday, December 5, 2013

The dangers of not being SMART

"Reaching a goal you don't have is as hard as returning to a place you've never been" (Zig Ziglar). Setting objectives is fundamental in order to excel yourself. If you never set a goal, you'll never be able to go beyond that line! Also, objectives are motivational. They:

    Focus attention on the actions relevant to a task;
    Lead to higher effort levels;
    Stimulate the development of long-term plans to reach that goal;
    Define time frames;
    Greatly increase resilience to failure;
    Tend to reduce procrastination;
    Allow more accurate feedback.


Having said that, setting goals should follow the typical "SMART" Criteria:

 – Specific: What exactly are we going to do for whom? Target a specific area for improvement. This should be as clear as possible, not vague;
 – Measurable: Is it quantifiable and can WE measure it? Concrete criteria for measuring progress towards the attainment of the goal;
 – Attainable: Can it be done? Choosing goals that are achievable and realistic;
 – Relevant: Will this objective have the desired effect?
 – Time-bound: When will this goal be accomplished? Grounding goals within a time frame.


One of the most common mistakes I see nowadays in the Software Development industry is failing the "A": being realistic! I know that due to time and budget constraints we have to be especially thorough when it comes to investments, but setting a goal for a project or an individual that is achievable only if everything goes perfectly is a truly horrific practice. In Software Development, nothing is ever perfect and the estimates should always be based upon a three-point estimation: worst case, expected case and best case scenario.
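One common way to combine the three points is the PERT-style weighted mean, (best + 4 x expected + worst) / 6. The sketch below assumes that formula (the post does not say which one to use), and the class and method names are hypothetical:

```csharp
using System;

public static class ThreePointEstimate
{
    // PERT-style weighted mean: the expected case weighs four times
    // as much as each of the two extremes. This weighting is an
    // assumption; other three-point formulas exist.
    public static double Expected(double best, double likely, double worst)
    {
        return (best + 4 * likely + worst) / 6.0;
    }

    public static void Main()
    {
        // A task estimated at 4 days (best), 6 days (expected), 14 days (worst).
        double days = Expected(best: 4, likely: 6, worst: 14);
        Console.WriteLine(days); // 7
    }
}
```

Note how the single bad scenario pulls the estimate above the "expected" 6 days: that extra day is exactly what a plan that assumes perfection leaves out.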

However, setting unrealistic goals isn't bad only because of the direct outcome: it will impact everything a team does! Another real danger is that the team spirit will be doomed! How can you give feedback to someone if, because of unrealistic goals, that team member is always missing "deadlines"? How can you demand high quality from a team member who lacks motivation, time, resources...?


So, do the right projects and do them right!


Be ambitious. Be Wise along the way...

Friday, November 15, 2013

Is Activator that bad?

When developing Software, there often comes a point where you have to make the typical "generic versus performance" decision. Developing generic code usually has a tradeoff: performance loss. For example, the use of Reflection adds overhead. Sometimes you don't need to build it generic, sometimes you have to (plugins are the most obvious example) and other times you can choose either way. In my case, the decision is simple: building it generic simplifies maintenance and future developments but has the performance overhead. Is that overhead relevant in the overall performance? What's the weight? Will the code be "usage intensive" or is the overhead irrelevant considering all the constraints? Of course, if you get to this point of "tuning", you have probably already fine-tuned a thousand other places before reaching here, but nevertheless...

Enough talking. Let's check it out!


What we will be testing

Before we start: I deleted code comments on purpose! This is the class that will be used.
Just that simple!


Now let's take a look at 5 ways to create "myClass" objects:
    Using the common "new" keyword;
    Activator.CreateInstance (using the full name of the type);
    Activator.CreateInstance (using type information directly);
    Using the ILGenerator;
    Using Lambda Expressions.

The first one is focused on "compile-time", as opposed to the others, whose focus is on runtime. Our goal here is to verify the performance of each one of these ways to create new objects. The first three are quite simple:
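As a minimal sketch of those first three approaches: the class MyClass and the namespace ActivatorDemo are hypothetical stand-ins for the post's "myClass", and the by-name variant assumes the Activator.CreateInstance(assemblyName, typeName) overload was the one used.

```csharp
using System;

namespace ActivatorDemo
{
    // Hypothetical stand-in for the post's "myClass".
    public class MyClass
    {
        public int Value { get; set; }
    }

    public static class Creation
    {
        // 1. Plain "new": the constructor call is resolved at compile time.
        public static MyClass WithNew()
        {
            return new MyClass();
        }

        // 2. Activator by name: the type is looked up from strings at runtime.
        public static MyClass WithActivatorByName()
        {
            string assemblyName = typeof(MyClass).Assembly.FullName;
            string typeName = typeof(MyClass).FullName; // "ActivatorDemo.MyClass"
            return (MyClass)Activator.CreateInstance(assemblyName, typeName).Unwrap();
        }

        // 3. Activator by type: the Type object is already at hand.
        public static MyClass WithActivatorByType()
        {
            return (MyClass)Activator.CreateInstance(typeof(MyClass));
        }

        public static void Main()
        {
            Console.WriteLine(WithNew().GetType().FullName);
            Console.WriteLine(WithActivatorByName().GetType().FullName);
            Console.WriteLine(WithActivatorByType().GetType().FullName);
        }
    }
}
```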



Nothing to explain here, just pure and simple Framework stuff. The DynamicInitializer is an improved version (using a "CachedStyle") of the code provided by Dean Oliver @ CodeProject, which uses an ILGenerator:
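The general shape of an ILGenerator-based factory with caching can be sketched as follows. This is not the CodeProject code itself; CachedIlFactory and Sample are hypothetical names, and only parameterless constructors are handled:

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CachedIlFactory<T> where T : class
{
    // Emitting IL is the expensive step, so it runs once per type
    // and the resulting delegate is cached in a static field.
    public static readonly Func<T> Create = Build();

    private static Func<T> Build()
    {
        ConstructorInfo ctor = typeof(T).GetConstructor(Type.EmptyTypes);
        if (ctor == null)
            throw new InvalidOperationException(
                typeof(T) + " needs a public parameterless constructor.");

        var method = new DynamicMethod(
            "Create_" + typeof(T).Name, typeof(T), Type.EmptyTypes,
            typeof(T).Module, true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Newobj, ctor); // push a new instance
        il.Emit(OpCodes.Ret);          // return it

        return (Func<T>)method.CreateDelegate(typeof(Func<T>));
    }
}

public class Sample { }

public static class IlFactoryDemo
{
    public static void Main()
    {
        Sample a = CachedIlFactory<Sample>.Create();
        Sample b = CachedIlFactory<Sample>.Create();
        Console.WriteLine(a != null && !ReferenceEquals(a, b)); // True
    }
}
```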



The ObjectGenerator is "heavy work", so, we used "a cached version". The Lambda expressions version is adapted from Roger Alsing and it's pretty cool:
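A minimal sketch of the lambda-expression idea, not Roger Alsing's original code: CachedLambdaFactory and Widget are hypothetical names, and the compiled delegate is cached so that the costly Compile() call happens only once per type.

```csharp
using System;
using System.Linq.Expressions;

public static class CachedLambdaFactory<T>
{
    // Builds the expression tree "() => new T()" and compiles it once.
    // Expression.New(Type) requires a parameterless constructor.
    public static readonly Func<T> Create =
        Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();
}

public class Widget { }

public static class LambdaFactoryDemo
{
    public static void Main()
    {
        Widget first = CachedLambdaFactory<Widget>.Create();
        Widget second = CachedLambdaFactory<Widget>.Create();
        Console.WriteLine(first != null && !ReferenceEquals(first, second)); // True
    }
}
```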



However, just like in the ILGenerator's version, it's only useful if we have a scenario where you pre-compile the lambda expressions once and then create several objects. The first run of this version is also heavy work!


Hardware

Three machines used:
    Intel Core 2 DUO @ 3 GHz using Windows XP. CPU Usage averaging 5%;
    Six-Core AMD Opteron Processor 2427 @ 2.20 GHz (4 processors) using Windows Server 2008 R2 Enterprise. CPU Usage averaging 3%;
    Intel Core 2 DUO @ 2 GHz using Windows 8. CPU Usage averaging 16%.

(Ignoring the machines' uptime. The tests ran locally, no network issues involved. Beware of 32-bit versus 64-bit architectures.)


How

Note that for the purpose of these tests, we're using just parameterless constructors. The performance test was done by executing 100 tests consisting of 1 million iterations per test on 3 different machines. Meaning:

100 tests x 1 million iterations x 3 machines = 300 million objects

So, each of the 5 methods shown above was invoked 300 million times.
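The measuring loop on one machine can be sketched roughly as below. This is an assumed harness, not the one actually used for the results; the names are hypothetical, and calling through a delegate adds a small constant cost to every method measured.

```csharp
using System;
using System.Diagnostics;

public static class CreationBenchmark
{
    // Runs "create" iterations times per test, for the given number of
    // tests, and returns the average elapsed milliseconds per test.
    public static double AverageMilliseconds(Func<object> create, int tests, int iterations)
    {
        create(); // warm-up call so JIT compilation is not measured

        long totalTicks = 0;
        for (int test = 0; test < tests; test++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                create();
            }
            watch.Stop();
            totalTicks += watch.ElapsedTicks;
        }

        // Stopwatch ticks are converted to milliseconds via Stopwatch.Frequency.
        return totalTicks * 1000.0 / Stopwatch.Frequency / tests;
    }

    public static void Main()
    {
        double average = AverageMilliseconds(() => new object(), tests: 100, iterations: 1000000);
        Console.WriteLine("new: {0:F2} ms per 1 million objects", average);
    }
}
```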



The Results

Here are the results. The Average Time Wasted is per 1 million "invokes"! In the graphic, I omitted the "ActivatorByName" since it would make the graph less readable.



Performance

So, "no brainer": The direct and simple new operator is the quickest. The main reason is obvious: most of the "work" was done at compile time rather than runtime. If you look at the generated IL, all the OpCodes were emitted in place and no reflection is used:


Now, the second best shot is using the ILGenerator. However, this is only valid due to the use of the "Cached" version. You might have noticed that in the "for" cycle we are using the "Cached" version, meaning a previous call to the ObjectGenerator method was made in order to actually emit the IL. If we were to call ObjectGenerator on every cycle iteration, the performance would be outrageous, the worst of all! What this means is that this implementation is only good for this particular scenario, in which we have several object creations in the same scope. The Lambda Expressions version suffers from the same issue. It's the next best shot but, just like the ILGenerator version, we have to ensure only one compilation of the Lambda Expression and several object creations in the same scope. You can see more info here.

Now, the ActivatorByType was approximately 10 times worse than the basic "new". This is because most of the work is now done at runtime using Reflection and not at compile time, as it is with the "new" keyword. If you look at an implementation of Activator.CreateInstance() you can see this more clearly:



This is an implementation from the Shared Source Common Language Infrastructure (aka SSCLI20 (Rotor)). Finally, the CreateInstance by name. It's the worst because it has even more work to do under the hood (see the Activator.cs and RRType.cs from the SSCLI20 mentioned above).


Conclusions

The good news!
First of all, keep in mind that our worst case scenario here is 11173ms. That's roughly eleven seconds per 1 million objects! Wouldn't it be awesome if you could earn 1 million euros in 11 seconds? Second, if you're reaching this point of fine-tuning, you're either building an extremely high-performance, multithreaded, extensible, expandable and scalable piece of software or you should definitely be searching for performance improvements somewhere else.




Sunday, November 3, 2013

Shopping behind the scenes!

When shopping at ebay or making a simple phone call, have you ever thought "these actions are generating several tuples in a Database, going through an ETL process, filling DataMarts, Data Warehouses"... on and on and on? OLAP, MOLAP, ROLAP, DataMining, MDX queries, cubes... It's a world! We even have some buzzwords like "bigdata" and "semantic data models".

Talking about ebay, here's a "relaxed reading" about their data warehouse:



"The e-commerce giant stores almost 90PB of data about customer transactions and behaviors to support some $3500 of product sales a second."


30 years of MSPress

I like MSPress books. Guess what, they are celebrating 30 years!

See the post HERE.


Wednesday, October 30, 2013

Provider Microsoft.Jet.OLEDB

In the past couple of years, I've worked with several DBMS: Oracle, SQLServer, MySQL, DB2, SyBase, PostgreSQL and MSAccess. And in several "flavors". I even worked with MySQL when it had no Stored Procedures support! Which one is the best? Well, as with most answers in the IT world: it depends. Just like that! However, the most "Buggy" to me is undoubtedly MSAccess. And if the idea is just to have the database file alongside a simple WPF app and connect to it using the JET OleDB Provider, then it gets even worse!

Here goes a simple one:
      ISNULL is a Built-in function available in SQLServer. So, one might wonder: MSAccess is just another product in the Microsoft product line. It probably has this. Well, no, it doesn't. It has the NZ function. It does have IsNull, but with a different meaning (VB Style).

So far, I'm cool with it. But if you're going to connect using Microsoft OLEDB Provider for Microsoft JET, the "NZ" function is not available. At best, you have the "ugly" IIf. Don't like!
I know that MSAccess is from another Microsoft product family and, for historical reasons, it has its "VBLook", but damn it's annoying!
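To make the workaround concrete, here is a small sketch that builds the IIf(IsNull(...)) expression for a query sent through the Jet provider. The helper, the table (Orders) and the column (Discount) are hypothetical; the roughly equivalent forms would be ISNULL(Discount, 0) in SQLServer and Nz(Discount, 0) inside MSAccess itself.

```csharp
using System;

public static class JetNullHandling
{
    // Builds a Jet-compatible expression that replaces NULL in a column
    // with a fallback literal, since Nz is unavailable through OLEDB.
    public static string Coalesce(string column, string fallbackLiteral)
    {
        return "IIf(IsNull([" + column + "]), " + fallbackLiteral + ", [" + column + "])";
    }

    public static void Main()
    {
        // Hypothetical table and column names, for illustration only.
        string sql = "SELECT " + Coalesce("Discount", "0") + " AS Discount FROM Orders";
        Console.WriteLine(sql);
        // SELECT IIf(IsNull([Discount]), 0, [Discount]) AS Discount FROM Orders
    }
}
```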


More info:
Cool article on some MSAccess "Common Query Hurdles";
Yet another bug list: Allen B. tips for Microsoft Access (Scroll down to the "Flaws in Access")
(I've already had the displeasure to see some of them live!)


Tuesday, October 15, 2013

MJIT – Multicore Just-In-Time Compilation

Managed Execution Environments brought us a new world. The advantages are priceless and this paradigm changed the way we develop software solutions. However, it does have a downside: it adds overhead. For extremely high performance software (real-time critical), you would still write native apps, but as time passes and hardware capabilities increase, the overhead will become irrelevant. Nevertheless, improvements to the way the execution engine works are always welcome. Putting aside the NGEN feature, .NET 4.5 introduced Multicore Just-in-time compilation, "which uses parallelization to reduce the JIT compilation time during application startup."

Check it out:

One cool detail:
"Multicore JIT is on by default for Silverlight 5 and ASP.NET applications, but not for desktop applications. The main reason for this is that the CLR needs a place to save and load the files containing the JIT profile information. Silverlight 5 and ASP.NET applications are hosted applications and the host provides a good place to store the profile information. We also know that all of these applications have a similar startup path and will be able to take advantage of MCJ. For desktop applications we don't have a good location to put the profile, and not all applications will benefit from MCJ on process startup. For desktop applications, we chose to provide a set of APIs that can be used to "opt-in" if it will benefit your application."

More resources:
Check out the interview with Vance Morrison (.NET Performance Architect) and Dan Taylor (Multicore JIT program manager) on Channel9.
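For desktop applications, the "opt-in" mentioned in the quote above is done through the System.Runtime.ProfileOptimization class. A minimal sketch, in which the folder and profile file names are hypothetical:

```csharp
using System;
using System.IO;
using System.Runtime;

public static class McjStartup
{
    // Enables Multicore JIT for this process and returns the folder
    // where the CLR will save and load the profile file.
    public static string Enable(string profileFileName)
    {
        // Hypothetical location; any folder the app can write to will do.
        string profileRoot = Path.Combine(Path.GetTempPath(), "McjProfiles");
        Directory.CreateDirectory(profileRoot);

        ProfileOptimization.SetProfileRoot(profileRoot);
        ProfileOptimization.StartProfile(profileFileName);
        return profileRoot;
    }

    public static void Main()
    {
        // Call this as early as possible, before most methods are JIT-compiled.
        string root = Enable("Startup.profile");
        Console.WriteLine("Multicore JIT profile folder: " + root);
    }
}
```

The first run only records the profile; later runs are the ones that can use it to compile methods in the background.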


Saturday, July 27, 2013

About Green Code

Commenting code is one thing. Documenting it is a completely different thing. And both have two categories: useful and useless. Let me show you what I'm talking about.

1 – Commenting Code

        Useless


Why on earth would you do something like this? If "x" and "y" are never used, cool: remove the line! Also, commenting with // and then /**/ is annoying! Do not mix the ways you comment lines. Use a pattern! You can use both styles of comment, but follow some guidelines! In the extreme, mixing them "blows" the shortcuts in VisualStudio.


        Useful


Ok, somewhere in time that HasProfile stuff was used. Or it probably will be. Or it makes sense. Or the client wasn't sure whether it should be a rule and wanted to see how males interact with some feature, without knowing whether that would be definitive or whether the limitation on females would come back. Cool, don't delete it. Avoid rewriting everything again! In the extreme, you could delete it (that's why we have Source Control Systems), but it's easier this way.


2 – Documenting Code

        Useless


Cool! This is the worst! The code is self-explanatory and if you're not going to add any extra valuable information, then don't write anything. It only makes the code less readable.

        Useful



Green code is not about having lots of green color on your screen. It's about having useful, meaningful and extra-valuable information placed in specific (strategic) spots that can make a difference and be helpful in the future.

Tuesday, May 28, 2013

String Pooling – are you immutable?

Last night I was doing some performance improvements to an already ultra light application with a co-worker. The focus was to reduce the memory footprint as much as possible. In the middle of this work, somewhere in the code there was something like:



Ok, it was more "string intensive", but let's keep it simple. Let's look at this example. What's "wrong" here? Well, just a detail:



Ok. They hold the same text BUT they actually "reference" two different objects. Strings are immutable, so far so good, here's the output:



BUT, what happens if you have something like:



Here's the output:



So, firstString and secondString are not only equal in value but reference the same object. How is this possible? This is actually not .NET-specific. Lots of unmanaged compilers have been doing this for ages. Quoting Jeffrey Richter:

"When compiling source code, your compiler must process each literal string and emit the string into the managed module's metadata. If the same literal string appears several times in your source code, emitting all of these strings into the metadata will bloat the size of the resulting file. To remove this bloat, many compilers (including the Microsoft C# compiler) write the literal string into the module's metadata only once. All code that references the string will be modified to refer to the one string in the metadata. This ability of a compiler to merge multiple occurrences of a single string into a single instance can reduce the size of a module substantially."

Just another detail. But a cool one!
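The behaviour can be reproduced with a small sketch like the one below. The variable names firstString and secondString follow the text above; builtString and the "hello" literal are hypothetical choices for illustration.

```csharp
using System;

public static class StringPoolingDemo
{
    public static void Main()
    {
        // Two identical literals: the compiler emits the text once and
        // both variables end up referencing the same pooled object.
        string firstString = "hello";
        string secondString = "hello";
        Console.WriteLine(ReferenceEquals(firstString, secondString)); // True

        // A string built at runtime holds the same text in a different object.
        string builtString = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        Console.WriteLine(firstString == builtString);                 // True
        Console.WriteLine(ReferenceEquals(firstString, builtString));  // False

        // String.Intern returns the pooled instance holding that text.
        Console.WriteLine(ReferenceEquals(firstString, string.Intern(builtString))); // True
    }
}
```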

Final note: Some people refer to this as String Interning. However, be careful when using the String.Intern method directly. It does have bad performance side effects (refer to the section "Performance considerations" in the MSDN documentation).

Thursday, February 14, 2013

SQL CLR – Supported .NET Framework Libraries

One can discuss whether SQLCLR is an important feature or not. My opinion? Yes, it might be useful in some specific scenarios BUT I still say that the Data is the most valuable part of any system (or business) and you'll want it to be as controlled, secured and closed as possible. And I'm a .NET enthusiast...

However, if you do need to use SQLCLR, be aware that there is a limited list of supported assemblies that you can use. Yes, you can load any assembly that you want, BUT, if it's not in the approved list, "you are now shifting the responsibility of managing those assemblies to *yourself*, the SQL Server application programmer and/or DBA". And this is no problem if it's your assembly; after all, it's your code. But you also become responsible for some of Microsoft's assemblies that are not in the supported list! Did you know that System.DirectoryServices.dll is not on that list?

You can see a List of the supported assemblies HERE;

One other important issue has to do with the versioning. Quoting Microsoft, "servicing or upgrading libraries in the GAC does not update those assemblies inside SQL Server. If an assembly exists both in a SQL Server database and in the GAC, the two copies of the assembly must exactly match. If they do not match, an error will occur when the assembly is used by SQL Server CLR integration." And by "must exactly match", it includes MVIDs.

"When the CLR loads an assembly, the CLR verifies that the same assembly is in the GAC. If the same assembly is in the GAC, the CLR verifies that the Module Version IDs (MVIDs) of these assemblies match. (...) When an assembly is recompiled, the MVID of the assembly changes. Therefore, if you update the .NET Framework, the .NET Framework assemblies have different MVIDs because those assemblies are recompiled. Additionally, if you update your own assembly, the assembly is recompiled. Therefore, the assembly also has a different MVID."

So, final conclusion: if you're using unsupported assemblies "machine-wide", anyone updating them in the GAC must be aware that the update might "break" something if they are also used inside SQLServer.
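To check what "must exactly match" means in practice, the MVID of any loaded assembly can be read through reflection. A minimal sketch (the helper name is hypothetical), which prints the MVID of the core library so it can be compared across machines or after an update:

```csharp
using System;
using System.Reflection;

public static class MvidInspector
{
    // Returns the Module Version ID of the assembly's manifest module.
    // It changes every time the assembly is recompiled.
    public static Guid GetMvid(Assembly assembly)
    {
        return assembly.ManifestModule.ModuleVersionId;
    }

    public static void Main()
    {
        Assembly core = typeof(object).Assembly;
        Console.WriteLine("{0}: {1}", core.GetName().Name, GetMvid(core));
    }
}
```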


Here's an excellent write-up on this issue:
And Microsoft's KB Article:

Saturday, January 12, 2013

Time is money (Personal Working Processes)

In my previous post, I stated that ".NET related tasks represent something like 10% of my working time". A blog reader (and co-worker) asked me how I knew this.

Well, it was January 2011 and I was drowning in work. I had so many tasks to do, so many deliverables and so many milestones to achieve that I was completely disorganized. All my days ended at around 3 a.m. and it always felt like the day hadn't been that productive. So, I developed a small stats app to find out exactly where I was wasting time. Based on the Windows Processes, the app registered all my movements (how much time I was using Word, Excel, Powerpoint, Visual Studio, SQLServer, MySQL, MSProject, PHP development, PHC, on and on...).

The goal? Efficiency. Based on the statistics gathered, I was able to extract valuable information about my working habits, some of them not so good. How much time do you spend browsing the internet between tasks? And in Windows Explorer organizing your hard drive because you were lazy when you downloaded something? And searching the hard drive? And toggling between MSProject, Word and SQLServer?

Since then, I've made several improvements to my daily work. The most visible improvement is a basic personal ticketing system (prioritizing tasks is invaluable).