processingFailure error (400) while retrieving CommentThreads list
The issue is that we can't really retrieve all the comments of every video.
https://issuetracker.google.com/issues/134912604
We currently don't support paging through the whole stream. So there's no way to retrieve all the 1000+ commentThreads that you have for that video.
This is not a solution to your problem. It just shows that querying the endpoint with a GET request succeeds in obtaining the needed page response from the API.
# comments-wget [-d] VIDEO_ID [PAGE_TOKEN]
$ comments-wget() {
    # With -d, only print the wget command instead of running it.
    local x='eval'
    [ "$1" == '-d' ] && {
        x='echo'
        shift
    }
    local v="$1"
    quote2 -i v
    local p="$2"
    quote2 -i p
    # Pick the first unused output file /tmp/VIDEO_ID-commentsN.json.
    local O="/tmp/$v-comments%d.json"
    local o
    local k=0
    while :; do
        printf -v o "$O" "$k"
        [ ! -f "$o" ] && break
        (( k++ ))
    done
    quote o
    # From here on, k holds the API key.
    k="$APP_KEY"
    quote2 -i k
    local a="$AGENT"
    quote2 a
    local c="\
        wget \
            --debug \
            --verbose \
            --no-check-certificate \
            --output-document=$o \
            --user-agent=$a \
            'https://www.googleapis.com/youtube/v3/commentThreads?key=$k&videoId=$v&part=replies,snippet&order=relevance&maxResults=100&textFormat=plainText&alt=json${p:+&pageToken=$p}'"
    $x "$c"
}
$ PAGE_TOKEN=...
$ AGENT=... APP_KEY=... comments-wget CJ_GCPaKywg "$PAGE_TOKEN"
Setting --verbose (verbose) to 1
Setting --check-certificate (checkcertificate) to 0
Setting --output-document (outputdocument) to /tmp/CJ_GCPaKywg-comments0.json
Setting --user-agent (useragent) to ...
DEBUG output created by Wget 1.14 on linux-gnu.
--2019-06-10 17:41:11-- https://www.googleapis.com/youtube/v3/commentThreads?...
Resolving www.googleapis.com... 172.217.19.106, 216.58.214.202, 216.58.214.234, ...
Caching www.googleapis.com => 172.217.19.106 216.58.214.202 216.58.214.234 172.217.16.106 172.217.20.10 2a00:1450:400d:808::200a
Connecting to www.googleapis.com|172.217.19.106|:443... connected.
Created socket 5.
Releasing 0x0000000000ae57c0 (new refcount 1).
---request begin---
GET /youtube/v3/commentThreads?.../1.1
User-Agent: ...
Accept: */*
Host: www.googleapis.com
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK
Expires: Mon, 10 Jun 2019 14:43:39 GMT
Date: Mon, 10 Jun 2019 14:43:39 GMT
Cache-Control: private, max-age=0, must-revalidate, no-transform
ETag: "XpPGQXPnxQJhLgs6enD_n8JR4Qk/OUAqOrEpA9YYqmVx0wqn9en_OrE"
Vary: Origin
Vary: X-Origin
Content-Type: application/json; charset=UTF-8
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Content-Length: 205965
Server: GSE
Alt-Svc: quic=":443"; ma=2592000; v="46,44,43,39"
---response end---
200 OK
Registered socket 5 for persistent reuse.
Length: 205965 (201K) [application/json]
Saving to: ‘/tmp/CJ_GCPaKywg-comments0.json’
100%[==========================================>] 205,965 580KB/s in 0.3s
2019-06-10 17:41:18 (580 KB/s) - ‘/tmp/CJ_GCPaKywg-comments0.json’ saved [205965/205965]
Note that the shell functions quote and quote2 above are those from youtube-data.sh (they are not really needed). $PAGE_TOKEN is extracted from the body string of the JSON request object posted above.
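Since quote and quote2 are not really needed, the GET URL can also be assembled without them. The following is only a minimal sketch, not the code that produced the log above: the function name comments_url is hypothetical, and it assumes that APP_KEY is set and that the video ID, the API key and the page token contain only URL-safe characters, because no escaping is done. It prints the same URL, with the same query parameters, that comments-wget passes to wget.

```shell
# comments_url VIDEO_ID [PAGE_TOKEN]
# Hypothetical helper: prints the commentThreads GET URL used above.
# Assumes APP_KEY is set and every value is URL-safe (no escaping is done).
comments_url() {
    local v="$1"          # video ID
    local p="$2"          # optional page token
    local k="$APP_KEY"    # API key taken from the environment
    # ${p:+...} appends the pageToken parameter only when a token was given.
    printf '%s\n' "https://www.googleapis.com/youtube/v3/commentThreads?key=$k&videoId=$v&part=replies,snippet&order=relevance&maxResults=100&textFormat=plainText&alt=json${p:+&pageToken=$p}"
}

# Usage (performs a real request, so it is left commented out):
#   wget --output-document=/tmp/comments.json "$(comments_url CJ_GCPaKywg "$PAGE_TOKEN")"
```

Printing the URL instead of running wget directly keeps the sketch easy to inspect, much like the -d switch of comments-wget does.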
The next question is: why does your Python code use a POST request method? Could that be the cause of your problem?