Change feed processor cannot read from the beginning #938

Closed
CMihalcik opened this issue Oct 25, 2019 · 22 comments · Fixed by #954

Comments

@CMihalcik

Describe the bug
A ChangeFeedProcessor configured to read from the beginning per this page is never fed any records.

To Reproduce
Clone https://github.com/CMihalcik/cosmosdb-change-feed-from-beginning, add Cosmos credentials and run.

Expected behavior
The processor's ChangesHandler delegate is called with the earliest items in the collection.

Actual behavior
The processor's ChangesHandler delegate is never called.

Environment summary
SDK Version: 3.3.2
OS Version: macOS 10.13.6

Additional context
When the change feed processor is configured this way, it doesn't seem to be called with any items, even those created after the processor is started.

@bartelink
Contributor

bartelink commented Oct 25, 2019

Are you starting from an empty aux collection? (Link to some troubleshooting info with detailed explanations from when I was confused)

@CMihalcik
Author

I'm not familiar with the term aux, but if it's the same as a lease collection, then yes, I'm starting with an empty one.

The linked project deletes both the lease and source collections (if they exist) and then recreates them.

@CMihalcik
Author

I think I'm seeing the same issue when running the ChangeFeed Usage sample. Sections 2 and 3 don't seem to produce any output.

ChangeFeed-Usage-Output.txt

@bartelink
Contributor

I haven't looked at your example, was only trying to rule out stupid stuff ;) (I'm still using V2 as some critical features I rely on are missing in the V3 rendition so far); perhaps the team will respond in due course.

@CMihalcik
Author

I'm seeing the same output from the ChangeFeed Usage sample running on Windows 10.

ChangeFeed-Usage-Output.windows.txt

@ealsur
Member

ealsur commented Oct 28, 2019

@CMihalcik Can you try deleting the content of the Leases Container, and then starting the Processor?

StartFromBeginning is meant to pick up items that were created before the Processor was ever created.

Are you running against the Emulator or a live Cosmos DB account?

@ealsur
Member

ealsur commented Oct 28, 2019

The problem with the "Usages" project is that the Processor's initialization does not take a fixed amount of time; it depends on the size of the monitored container, so those delays are merely trying to linearize work that is happening on other threads.

Having said that, I ran the Usages against the Emulator and I see the expected output:

image

What I noticed in your sample code is that you initialize the Processor, and then wait 5 seconds (https://github.com/CMihalcik/cosmosdb-change-feed-from-beginning/blob/master/Program.cs#L23) with a Thread.Sleep which blocks (try await Task.Delay), and you insert 1 item after (https://github.com/CMihalcik/cosmosdb-change-feed-from-beginning/blob/master/Program.cs#L26).

There are no items being inserted before the Processor is created, so WithStartTime(DateTime.MinValue.ToUniversalTime()) doesn't really do anything.
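
The setup under discussion can be sketched roughly as follows. This is a minimal sketch, assuming the Microsoft.Azure.Cosmos 3.x package; the endpoint, key, database, container, processor, and instance names and the Item type are placeholders, not taken from the linked repo. It creates an item before the processor starts, asks for a start from the beginning, and waits with await Task.Delay rather than Thread.Sleep.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Placeholder document type; the lowercase "id" matches the required Cosmos id property.
public class Item
{
    public string id { get; set; }
}

public static class Program
{
    public static async Task Main()
    {
        // Placeholder credentials and names (assumptions, not from the original repo).
        using CosmosClient client = new CosmosClient("<account-endpoint>", "<read-write-key>");
        Database database = await client.CreateDatabaseIfNotExistsAsync("sample-db");
        Container source = await database.CreateContainerIfNotExistsAsync("source", "/id");
        Container leases = await database.CreateContainerIfNotExistsAsync("leases", "/id");

        // Insert an item BEFORE the processor exists, so that reading
        // from the beginning has something to pick up.
        await source.CreateItemAsync(new Item { id = "created-before-start" });

        ChangeFeedProcessor processor = source
            .GetChangeFeedProcessorBuilder<Item>("from-beginning", HandleChangesAsync)
            .WithInstanceName("instance-1")
            .WithLeaseContainer(leases)
            // Request the feed from the earliest available change.
            .WithStartTime(DateTime.MinValue.ToUniversalTime())
            .Build();

        await processor.StartAsync();

        // Non-blocking wait; initialization time varies, so 30 seconds is only a guess.
        await Task.Delay(TimeSpan.FromSeconds(30));

        await processor.StopAsync();
    }

    // Called with each batch of changes read from the monitored container.
    private static Task HandleChangesAsync(
        IReadOnlyCollection<Item> changes,
        CancellationToken cancellationToken)
    {
        foreach (Item item in changes)
        {
            Console.WriteLine($"Change detected for item {item.id}");
        }

        return Task.CompletedTask;
    }
}
```

The lease container has to be empty for the start time to take effect, since existing leases already carry a continuation point.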

@CMihalcik
Author

@ealsur Thanks for taking a look at this.

Both the source and lease containers are deleted and then recreated each time the script is run.

https://github.com/CMihalcik/cosmosdb-change-feed-from-beginning/blob/c9db374e8016ff08eb46cc382fd25a11def50be9/Program.cs#L34-L47
https://github.com/CMihalcik/cosmosdb-change-feed-from-beginning/blob/c9db374e8016ff08eb46cc382fd25a11def50be9/Program.cs#L51-L64

I'm running against a live Cosmos DB account.

And although the method name StartChangeFeedProcessorAsync confuses things, an item is inserted into the source container before the change feed is started.

https://github.com/CMihalcik/cosmosdb-change-feed-from-beginning/blob/c9db374e8016ff08eb46cc382fd25a11def50be9/Program.cs#L48-L49

@ealsur
Member

ealsur commented Oct 29, 2019

@CMihalcik I cloned your repo and ran it, only adding my account credentials, and I see this output:

image

@ealsur
Member

ealsur commented Oct 29, 2019

Second run also works, same logs except the No container to delete:

image

@CMihalcik
Author

CMihalcik commented Oct 29, 2019

Here's what I get on macOS 10.13.6. I let each run go for a minute before killing the process.
Screen Shot 2019-10-29 at 9 56 46 AM

I wonder if this is an OS specific issue, perhaps with date formatting.

@ealsur
Copy link
Member

ealsur commented Oct 29, 2019

If you remove the WithStartTime, do you at least see the second item? Are you connecting to the Emulator or a real account? If the latter, are you using a read/write key?

@CMihalcik
Author

If you remove the WithStartTime, do you at least see the second item?

Yes

Are you connecting to the Emulator or a real account?

A real cosmos instance

If the latter, are you using a read/write key?

Yes, I'm using a read/write key

@CMihalcik
Author

I think the issue may be that WithStartTime depends on the HTTP request header If-Modified-Since, and that Cosmos doesn't support If-Modified-Since when the Cosmos account is configured for Multi-region writes.

If I debug and break on this line:

https://github.com/Azure/azure-cosmos-dotnet-v3/blob/3.3.2/Microsoft.Azure.Cosmos/src/Resource/QueryResponses/ChangeFeedPartitionKeyResultSetIteratorCore.cs#L71

And inspect the value of result.ErrorMessage I see a string that begins with

Message: {
	"Errors":[
		"StartTime\/IfMofifiedSince is not currently supported when EnableMultipleWriteLocations is set."
	]
}

The Cosmos account I'm using is configured for Multi-region writes (that seems to be the default when creating one through the portal). I spun up a new Cosmos account with Multi-region writes disabled and the project ran as expected.

FWIW, I can't find any documentation about an If-Modified-Since limitation for Cosmos or Change Feed.

@CMihalcik
Author

One more thing, it looks like the client API logs an error when this happens, but I wasn't seeing it because I've been running from the command line. On macOS, setting the environment variable COMPlus_DebugWriteToStdErr=1

export COMPlus_DebugWriteToStdErr=1
dotnet run

sends that message to stderr where it looks like this:

DocDBTrace Error: 0 : DocumentClientException with status code BadRequest, message: Message: {"Errors":["StartTime\/IfMofifiedSince is not currently supported when EnableMultipleWriteLocations is set."]}, inner exception: null, and response headers: {
"x-ms-last-state-change-utc": "Thu, 31 Oct 2019 17:10:58.523 GMT",
"lsn": "2",
"x-ms-schemaversion": "1.8",
"x-ms-quorum-acked-lsn": "2",
"x-ms-current-write-quorum": "3",
"x-ms-current-replica-set-size": "4",
"x-ms-documentdb-partitionkeyrangeid": "0",
"x-ms-xp-role": "1",
"x-ms-global-Committed-lsn": "2",
"x-ms-number-of-read-regions": "0",
"x-ms-transport-request-id": "2",
"x-ms-cosmos-llsn": "2",
"x-ms-cosmos-quorum-acked-llsn": "2",
"x-ms-session-token": "-1#2",
"x-ms-request-charge": "1",
"x-ms-serviceversion": " version=2.7.0.0",
"x-ms-activity-id": "a8b84752-01b8-45f6-a9ae-cb8a156cb446",
}

@ealsur
Member

ealsur commented Oct 31, 2019

@CMihalcik good catch, I was actually not aware of the lack of documentation. I'll try and bring this up in our existing Change Feed docs, thanks for the investigation!

@svenskmand

Hi :)

Will setting a specific start time for a change feed processor be supported on multi-zone/master accounts? We really need this feature for our setup at our company :)

@timsander1
Contributor

Hi @svenskmand, just want to confirm that I understand your question. Are you asking about change feed processor support for starting at a specific time for multi-region (but single write region) or multi-region write (multi-master) accounts?

@svenskmand

svenskmand commented Aug 27, 2020

Hi @timsander1 :)

For us, I think multi-region with a single write region will be enough (is this the default setup, or is that multi-master?).

Maybe you can also help me understand the parallelism of the change feed processors. I asked a follow-up question on this Stack Overflow question: https://stackoverflow.com/questions/54121204/using-multiple-consumers-with-cosmosdb-change-feed/54134642#comment112488770_54134642. It seems that I, as a user of Cosmos DB, cannot control the parallelism of a given change feed processor. Is that correct?

@timsander1
Contributor

In most cases where customers want multiple regions, they don't necessarily need multi-master (multiple write regions). The default setup is single-region write, not multi-master.

The upper bound for units of parallelization when using the Change Feed Processor is currently your total number of physical partitions. While you can't directly control the number of physical partitions, the number is based on your provisioned throughput and consumed storage amount. It usually works out that workloads large enough to need significant parallelizing of processing of changes also have a lot of provisioned throughput/storage and have many physical partitions.

We are currently working on removing this dependency on physical partitions, but it is not typically a blocker for customers.

@svenskmand

Hi @timsander1 :)

Thanks for your reply :)

Our problem is that we do not have that much data, around 20 GiB in Cosmos, but we have many change feed processors (currently 20). We have so many because we use a lambda architecture. Naturally, the speed of each change feed processor varies greatly, and in a lambda architecture you would just scale the number of instances for the slower change feeds.

But unfortunately this is not possible for us since to scale the number of change feed instances we also need to have much more data :(

The change you mentioned that you are working on, to decouple the instances of change feed processors from the number of physical partitions, will that allow us to decide the parallelism ourselves? Do you have a public tracking task we can follow? :)

@timsander1
Contributor

Let me just clarify my previous comment to make sure we are talking about the same thing (I think we are but just want to make sure :) )

For the 20 different change feed processors (with different processorName and lease container configuration), there's no dependency on the number of physical partitions. You can create as many of these as you'd like.

For a particular deployment unit, the number of instances is bound by the number of physical partitions. If this upper bound is an issue you could either split up the work between different deployment units (basically make the logic in each delegate quicker to execute) or temporarily raise (then lower) throughput to increase the number of physical partitions. For example, you could try raising throughput to 30,000 RUs then lowering back down to your desired amount a few hours later.
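
As a rough illustration of that upper bound, here is a small sketch. It assumes that each physical partition maps to one lease and that leases are spread as evenly as possible across instances; the LeaseMath helper and its method names are hypothetical and not part of the SDK.

```csharp
using System;

// Hypothetical helper for reasoning about the parallelism bound; not an SDK type.
public static class LeaseMath
{
    // Instances that can hold at least one lease: capped by the partition count.
    public static int ActiveInstances(int physicalPartitions, int instances)
    {
        Validate(physicalPartitions, instances);
        return Math.Min(physicalPartitions, instances);
    }

    // Largest number of leases any single instance holds under an even spread
    // (integer ceiling of physicalPartitions / instances).
    public static int MaxLeasesPerInstance(int physicalPartitions, int instances)
    {
        Validate(physicalPartitions, instances);
        return (physicalPartitions + instances - 1) / instances;
    }

    private static void Validate(int physicalPartitions, int instances)
    {
        if (physicalPartitions < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(physicalPartitions));
        }

        if (instances < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(instances));
        }
    }
}
```

With 2 physical partitions and 5 instances in one deployment unit, ActiveInstances returns 2, so the other three instances would have nothing to process. With 10 physical partitions and 3 instances, all three are busy and the most loaded one holds 4 leases.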

You can track progress on User Voice: https://feedback.azure.com/forums/263030-azure-cosmos-db?filter=top&page=2. I don't believe this entry has been submitted yet but please feel free to submit and we will post periodic updates.
