Status
Not open for further replies.
1 - 16 of 16 Posts

·
AllStar
Joined
·
68 Posts
Discussion Starter · #1 ·
What I read here and my own experience with the HR20 both point to software more than hardware, as an RBR seems to fix the problem... albeit temporarily. The 13 updates also suggest D* is trying to fix the problems with software, and they are not calling a hardware recall on these boxes. Hardware failures are for the most part fatal and not fixed with a software reboot.

The computer hardware/software configurations in my line of business all have a built-in “post mortem” feature, which essentially writes to the hard drive what the microprocessor was doing (its registers, buffers, etc., and the code it was running) in real time, so the data is available when a failure occurs.

Considering the diverse number of problems, the number of people having them and D*'s inability to fix the problems so far, why don’t/can’t they include some sort of “post mortem” feature in the next update? The “post mortem” data could be uploaded to D* via the telephone line as part of the RBR. This could tell them what was happening when the customer had to reset the box.

This hit-and-miss approach, with 13 updates so far, really speaks volumes about what kind of grasp D* has on what the problems are. I really don’t want to wait for update 27, when all the bugs are finally fixed.
 

·
Lifetime Achiever
Joined
·
30,092 Posts
I am sure they could...

But then you could get into more of the "legal" mumbo jumbo...

What are you sending...
I didn't say you could use my phone line for that...
I don't want them to have that data...
etc...

Very similar to the same nonsense you see from people complaining about the Microsoft "Do you want to send error report data" prompt.
 

·
DBSTalk Club Member
Joined
·
146 Posts
Earl Bonovich said:
I am sure they could...

But then you could get into more of the "legal" mumbo jumbo...

What are you sending...
I didn't say you could use my phone line for that...
I don't want them to have that data...
etc...

Very similar to the same nonsense you see from people complaining about the Microsoft "Do you want to send error report data" prompt.
Could they enable a "Santa" type feature for this? I would opt in on a program like this if it would help the developers out.
 

·
Lifetime Achiever
Joined
·
30,092 Posts
Dave_S said:
Could they enable a "Santa" type feature for this? I would opt in on a program like this if it would help the developers out.
I don't know... but when my contacts get back into the office... I'll ask.
 

·
DBSTalk Club Member
Joined
·
1,364 Posts
A reverse Santa! I would opt in also...
 

·
Registered
Joined
·
16,659 Posts
Earl Bonovich said:
I am sure they could...

But then you could get into more of the "legal" mumbo jumbo...

What are you sending...
I didn't say you could use my phone line for that...
I don't want them to have that data...
etc...

Very similar to the same nonsense you see from people complaining about the Microsoft "Do you want to send error report data" prompt.
So make it an option that's off by default and nothing would get sent unless the customer agrees to it.
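A minimal sketch of that off-by-default idea, in Python. The names `DiagnosticsSettings`, `maybe_upload` and `send` are hypothetical and not anything the HR20 software is known to contain; the only point is that nothing leaves the box unless the customer has switched the option on.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticsSettings:
    # Hypothetical setting: uploads stay off until the customer opts in.
    upload_enabled: bool = False


def maybe_upload(report, settings, send):
    """Send `report` via `send` only if the customer has opted in.

    Returns True if the report was sent, False if it was held back.
    """
    if not settings.upload_enabled:
        return False
    send(report)
    return True
```

With the default settings, `maybe_upload` returns `False` and `send` is never called.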
 

·
Icon
Joined
·
764 Posts
Yes, you can certainly pencil me in as one who would allow that. A dump, I would think, would help them a lot. And maybe they would actually do something with the info, as opposed to MS :lol:

If the routines were installed, they could simply not upload the dumps unless they have permission. It would be impossible to go through everybody's log, of course.

Or maybe pick 10 or 20 people from this forum who are having numerous problems and install traps on their systems. Might be very useful.
 

·
Beware the Attack Basset
Joined
·
24,438 Posts
I'm guessing that the RBR is similar to a BRS (Big Red Switch) so there probably isn't any chance of spitting out a PMD on the way down. In order to do a proper PMD, you have to be able to reliably identify an imminent crash condition in software and they don't seem to have that down yet.
 

·
Icon
Joined
·
764 Posts
harsh said:
I'm guessing that the RBR is similar to a BRS (Big Red Switch) so there probably isn't any chance of spitting out a PMD on the way down. In order to do a proper PMD, you have to be able to reliably identify an imminent crash condition in software and they don't seem to have that down yet.
But I wonder if they could set aside a small portion of disk space and log, say, no more than 1 or 2 minutes of activity: what buttons are pushed, what's stored in various registers, etc. Just keep rolling over, never going past the 2 minute mark. But then of course one would have to stop any activity when he had a problem OR be able to force an upload of the log, so I GUESS NOT!! :) Ah well, it was a thought.
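The rolling log described above could be sketched roughly like this in Python. `RollingLog` and its methods are hypothetical names for illustration, and a real receiver would write to reserved disk space or RAM rather than a Python `deque`. The sketch only shows the "never keep more than 2 minutes" behavior.

```python
import time
from collections import deque


class RollingLog:
    """Keeps only events from the last `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds=120.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self.events = deque()  # (timestamp, message) pairs, oldest first

    def record(self, message):
        """Append an event and discard anything that has aged out."""
        now = self.clock()
        self.events.append((now, message))
        self._trim(now)

    def _trim(self, now):
        # Drop entries older than the window so the log never grows past it.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()

    def snapshot(self):
        """Return the retained events, e.g. to save or upload after a failure."""
        self._trim(self.clock())
        return list(self.events)
```

Taking a `snapshot()` at the moment of failure is the "force an upload of the log" step: it freezes the last two minutes before anything else rolls over them.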
 

·
AllStar
Joined
·
68 Posts
Discussion Starter · #10 ·
harsh said:
I'm guessing that the RBR is similar to a BRS (Big Red Switch) so there probably isn't any chance of spitting out a PMD on the way down. In order to do a proper PMD, you have to be able to reliably identify an imminent crash condition in software and they don't seem to have that down yet.
Well no, the PMD "buffer" is real time. It tells you what was happening when the crash happened. In the case of lockups, the microprocessor is in one of several states. For example, it is in such a tight loop that you cannot get an interrupt through to make it respond, or it gets an instruction it does not understand and has no way of getting out of it (both of these have several sub-causes). The causes are many, but you need to know where it was when it happened. The buffer is stored on the hard drive and is continually updated.

A "work timer" is also a way of getting out of lockups. All code is run sequentially. While the code runs, there is a software timer which runs alongside it and gets reset after so many lines of instructions. If the processor gets locked up or looped, the work timer times out, resets the processor and sends it to a recovery routine.

Of course these troubleshooting tools are no guarantee of a fix every time the processor burps. But they can at least give the code writers some idea of what's going on. It's a lot better than what they have now. I understand the problem they are facing. Without some sort of data it's all hit and miss… and that explains 13 updates.

Earl, as far as phone line use goes: an 800 number of course, and a subscriber authorization message (yes/no) such as the one Microsoft uses when Windows crashes… it can be done.
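The work timer described above could be sketched roughly like this in Python. `WorkTimer` and `on_timeout` are hypothetical names, and this is only an illustration of the reset-or-recover pattern: in an embedded box the timer would normally be a hardware or interrupt-driven one, since a software thread cannot rescue a processor that is truly wedged.

```python
import threading


class WorkTimer:
    """Calls `on_timeout` if reset() is not called within `timeout_seconds`."""

    def __init__(self, timeout_seconds, on_timeout):
        self.timeout_seconds = timeout_seconds
        self.on_timeout = on_timeout  # stands in for the recovery routine
        self._timer = None
        self._lock = threading.Lock()

    def _arm(self):
        # Start a fresh countdown; caller must hold the lock.
        self._timer = threading.Timer(self.timeout_seconds, self.on_timeout)
        self._timer.daemon = True
        self._timer.start()

    def start(self):
        with self._lock:
            self._arm()

    def reset(self):
        """Healthy code calls this at regular checkpoints to restart the countdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._arm()

    def stop(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

As long as the main code keeps calling `reset()`, nothing happens. If it gets stuck and stops calling it, the countdown runs out and `on_timeout` fires once.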
 

·
Lifetime Achiever
Joined
·
21,331 Posts
Just so long as they don't pull a Microsoft--display a screen FULL of data (on a blue background) for 2 seconds, then reboot. I would much prefer something useful that can be captured and sent to Directv....:)

Cheers,
Tom
 

·
Registered
Joined
·
16,659 Posts
tibber said:
Just so long as they don't pull a Microsoft--display a screen FULL of data (on a blue background) for 2 seconds, then reboot. I would much prefer something useful that can be captured and sent to Directv....:)

Cheers,
Tom
Have you gone to the System Properties option, then Startup and Recovery, and unchecked the "Automatically restart" box in Windows?
 

·
Lifetime Achiever
Joined
·
21,331 Posts
RAD said:
Have you gone to the System Properties option, then Startup and Recovery, and unchecked the "Automatically restart" box in Windows?
LOL, yes I have from time to time. I just haven't found the same settings for the HR20. :)

Thank you for seriously trying to help my "problem". I truly was trying to make a joke.

(A few more years of good therapy and maybe I'll develop a sense of humor that is shared by more than 4 people.) :)

Thanks again, Rad,
Tom
 

·
Beware the Attack Basset
Joined
·
24,438 Posts
raw6464 said:
Well no, the PMD "buffer" is real time. It tells you what was happening when the crash happened.
I assumed that the original proposition was a PMD upon RBR. Writing another stream to a hard drive would not go well with those who are hoping for the ability to record more than two channels simultaneously. Buffering to some sort of RAM so it doesn't risk trashing the filesystem might be possible, but assumes that the CPU never subjugates control of everything.
A "work timer" is also a way of getting out lock ups.
Adding the overhead of a deadman timer and similar monitoring routines is something that needed to be planned for from the beginning.
When the code is run there is a software timer which runs and gets reset after so many lines of instructions. If the processor gets locked up or looped the work timer times out, resets the processor and sends it to a recovery routine.
I think you're ignoring the fact that a modern satellite receiver is not a monolithic processor system. There is more than one major processor involved. I would be willing to bet that many of the early problems were related to the decoding hardware gagging on bad or missing data.

To more completely supervise the decoders would add complexity and take away from the typically good performance for the few times that you encounter a problem. A process such as you suggest is often implemented in high speed CMOS until it is no longer needed. The cow is out of the barn and it is probably too late to start adding tools like that.
 

·
AllStar
Joined
·
68 Posts
Discussion Starter · #15 ·
harsh said:
I assumed that the original proposition was a PMD upon RBR. Writing another stream to a hard drive would not go well with those who are hoping for the ability to record more than two channels simultaneously. Buffering to some sort of RAM so it doesn't risk trashing the filesystem might be possible, but assumes that the CPU never subjugates control of everything. Adding the overhead of a deadman timer and similar monitoring routines is something that needed to be planned for from the beginning. I think you're ignoring the fact that a modern satellite receiver is not a monolithic processor system. There is more than one major processor involved. I would be willing to bet that many of the early problems were related to the decoding hardware gagging on bad or missing data.

To more completely supervise the decoders would add complexity and take away from the typically good performance for the few times that you encounter a problem. A process such as you suggest is often implemented in high speed CMOS until it is no longer needed. The cow is out of the barn and it is probably too late to start adding tools like that.
The PMD and sequence work timers work in FAR more complex computers, in digital telephone switching systems with hundreds/thousands of processors, which is how I made my living, and every time you see NASA launch a rocket the SAME technology is still in use.

And I'd venture to say MOST of the code that makes this box run is on the card... that's where the updates are going. I doubt very much any firmware on the motherboard is getting any updates. The update to turn on OTA went on the card, not in any firmware. Granted, there are other processors on the motherboard, but they are under the control of, and interact with, the code on the card.

I'll bet EVERY video packet that comes down gets validated by the card before it's put on the screen. Every time you change the channel, the code on the card has to authorize the channel. Every time you push a button, the card has to interrupt and tell the rest of the system what it wants. The features that are in the system are controlled by the card. It's the card that holds the code and controls the rest of the system. I would think if you could at least eliminate that as the problem... you have made progress.

Granted, PMD and timers are not a fix-all. They won't help in cases where there is nothing to trigger them or an RBR, such as the loss of CID. But when the system locks up, such as when the picture freezes and you have lost all control of the system, they will at least have someplace to begin the analysis. At the very least they can say the problem is NOT in the code... in which case all the updates to the card are for nil. Right now, the code writers are going over the code blind, doing "what ifs" and writing updates to "see if that helps"... I remind you of the 13 that have not fixed the system... you think if they knew what was going on, it would take them 13 updates? Personally I don't believe they really know if an update will provide "better stability" until after it's launched and they cross their fingers. IF D* has no way of trapping their code they could be at this for YEARS.

Whether or not D* can adapt to some sort of trapping on the existing platform is another issue... I do not know... perhaps they just don't have any available RAM on the card without turning something off. But if they can trap the code, that's the way I would go.

Anyone with a better suggestion I'm all ears.
 

·
Registered
Joined
·
5,501 Posts
With the network now in use, it would be pretty easy to use a netdump on a panic. But it takes a panic. Hangs and lockups don't generally panic the kernel.

I would also recommend a health agent to detect frozen/hung systems to auto reboot.

Spanky
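A health agent of the kind suggested above could be sketched like this in Python. `HealthAgent`, `heartbeat` and the `reboot` callback are hypothetical names, and nothing here reflects how the HR20 actually works. The assumption is that the main software reports in periodically and a separate monitor reboots the box if the reports stop.

```python
import time


class HealthAgent:
    """Triggers `reboot` when no heartbeat has arrived for too long."""

    def __init__(self, max_silence_seconds, reboot, clock=time.monotonic):
        self.max_silence_seconds = max_silence_seconds
        self.reboot = reboot  # stands in for whatever restarts the box
        self.clock = clock
        self.last_beat = clock()

    def heartbeat(self):
        """Called by the monitored software whenever it is making progress."""
        self.last_beat = self.clock()

    def check(self):
        """Called periodically by the monitor; returns True if it rebooted."""
        now = self.clock()
        if now - self.last_beat > self.max_silence_seconds:
            self.reboot()
            self.last_beat = now  # restart the countdown after the reboot
            return True
        return False
```

Unlike a panic-triggered netdump, this catches hangs as well, because it reacts to the absence of a signal rather than waiting for the kernel to report a failure.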
 