Is the mitigation for instrumental reasoning level 1 effectively just "scratchpads"? And if so, do we have a means of knowing whether what's being output to the scratchpad is a direct line to the AI's "subconscious", or whether it could be telling us what we want to hear there as well?
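For concreteness, a scratchpad setup is mechanically just more sampled text from the same model. A minimal sketch, assuming nothing beyond a generic text-completion call (`generate` and the tag format are stand-ins, not any particular API):

```python
# Minimal scratchpad harness (all names are hypothetical stand-ins).
# Key point: the "scratchpad" is ordinary model output, produced by the
# same sampling process as the answer, so nothing here guarantees that
# the stated reasoning is the reasoning actually used.
import re

def generate(prompt: str) -> str:
    """Stand-in for a real model call; returns text in the expected format."""
    return ("<scratchpad>The user wants Y, so I should say Y.</scratchpad>"
            "Final answer: Y")

def run_with_scratchpad(question: str) -> tuple[str, str]:
    prompt = ("Think step by step inside <scratchpad> tags, "
              "then give your final answer.\n" + question)
    output = generate(prompt)
    match = re.search(r"<scratchpad>(.*?)</scratchpad>", output, re.DOTALL)
    reasoning = match.group(1) if match else ""
    answer = re.sub(r"<scratchpad>.*?</scratchpad>", "", output,
                    flags=re.DOTALL).strip()
    # A monitor can inspect `reasoning`, but it is text sampled the same
    # way as `answer`, not a privileged readout of internal state.
    return reasoning, answer
```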
The safety plans of AI companies are very preliminary, because we are very early in the process.
All we have so far are LLMs, which predict the most likely output based on their training data. They can be quirky, but not too competent, and they lack a solid model of what they are dealing with beyond text and maybe images.
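In caricature, that prediction step looks like this (toy numbers and a four-word vocabulary, not a real model):

```python
# Toy illustration of "predict the most likely next token" (fictional
# scores; a real model computes these with a trained network).
import math

VOCAB = ["the", "cat", "sat", "mat"]

def next_token_logits(context: list[str]) -> list[float]:
    """Stand-in for a forward pass over the context."""
    return [0.1, 2.0, 0.5, 1.2]

def softmax(logits: list[float]) -> list[float]:
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(next_token_logits(["the"]))
best = VOCAB[max(range(len(VOCAB)), key=probs.__getitem__)]
print(best)  # -> "cat": the single most likely continuation, nothing deeper
```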
Next, there will be agents. Those will likely interact more with the real world: they can use tools, iterate, do some reasoning, maybe even gain feedback. But even these are likely not going to be too bright.
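Such an agent is, at its core, a loop: the model proposes an action, a tool executes it, and the observation goes back into the context. A minimal sketch (every function here is a hypothetical stand-in):

```python
# Bare-bones agent loop (hypothetical stand-ins throughout): this is
# where tool use, iteration, and feedback come from.

def model_decide(history: list[str]) -> str:
    """Stand-in for an LLM call that picks the next action."""
    return "search: weather Berlin" if len(history) < 2 else "finish: done"

def run_tool(action: str) -> str:
    """Stand-in for a real tool (search, code execution, ...)."""
    return f"result of {action!r}"

def agent_loop(task: str, max_steps: int = 5) -> str:
    history = [task]
    for _ in range(max_steps):
        action = model_decide(history)
        if action.startswith("finish:"):
            return action.removeprefix("finish:").strip()
        history.append(run_tool(action))  # feedback re-enters the context
    return "gave up"

print(agent_loop("what's the weather in Berlin?"))
```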
As such, while we should be mindful that the industry is moving fast, there's likely still a way to go.
The focus for now is best placed on thorough reliability testing, as with regular software.
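Concretely, that could look like an ordinary regression suite: fixed prompts, assertions on the outputs, run in CI like any other software test. A sketch (`ask_model` is a hypothetical stand-in for the system under test):

```python
# Regression-style reliability test for a model-backed system
# (hypothetical `ask_model`; assumes pytest is installed).
import pytest

def ask_model(prompt: str) -> str:
    """Stand-in for the system under test, ideally run deterministically
    (e.g. temperature 0) so failures are reproducible."""
    return "4"

@pytest.mark.parametrize("prompt,expected", [
    ("What is 2 + 2? Answer with a number only.", "4"),
])
def test_arithmetic_regression(prompt: str, expected: str) -> None:
    assert ask_model(prompt).strip() == expected
```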
But others are saying, no, the process is much further along. The article quotes Dario Amodei as saying ASL-4 by next year. Genuine question, not meant confrontationally, but why would I (a mere passer-by everyday moron) believe you and not him?
I think there is a lot of hype by startups that dearly need funding.
That's not a very persuasive response.
Look at history. The people selling hype usually fail; overeager companies crash and burn. Every time, they think this time is different. Either way, we will see soon enough.
Shouldn't we expect that agents that can gain feedback on their own, but are not too bright, will quickly become brighter? I think this should be a given, but it's totally possible I'm missing something.
If we have a good architecture, yes. It will likely take a few iterations. Chatbots primarily know language, and that is not enough. Improving the modeling will take time.
But others are saying, no, the process is much further along. The article quotes Dario Amodei as saying ASL-4 by next year. Genuine question, not meant confrontationally, but why would I (a mere passer-by everyday moron) believe you and not him?
Dario Amodei is a startup founder. His startup will go under unless he makes a profit by next year.
The range of opinions on when advanced AI will arrive is wide, and people like Demis Hassabis and Yann LeCun are more conservative.
Then there is the fact that the problems are hard: historically, large-scale infrastructure projects have taken many years to go from prototype to finished product.
I agree there is a lot to be confused about in terms of knowing anything for certain. Seems to me like we should assume that we're as far along as anyone is saying (or further); that way we don't under-prepare, no?
It is very hard to prepare for AI. So far, real-world deployment is close to zero; chatbots that answer questions did not move the needle much.
As in previous waves of tech, deployment and refinement will take time, and one cannot prepare for what does not exist.