I would agree wholeheartedly that the best way to filter unoptimized programs is to run a ton of cases and cases with high n.
There is a rough way of determining time complexity though: Because time complexity is asymptotic, once input is sufficiently big you can test how the runtime scales with input size.
Or, of course, you can just make really huge inputs and perform a stress test with time/memory constraint. Then the subpar algorithms will be choked to death :P
Time complexity is not determined by a test. It's generally determined by a human.
Your time complexity is determined by how many loops you need to do and what you do in each loop. If you do a loop over an array once, you have O(n). If you loop over an array for each element in the array, you have O(n^2). Sorting an array with quicksort usually takes O(nlogn) (log base 2), but bubble sort takes O(n^2). Generally, pick the average case, but there are situations in which you want to report the worst case.