Dev Tracker — Reddit

The replay system may be the best thing we've ever added to the game, and that may be the best use of the replay system I've ever seen.

Wanted more art, got more art, am happy.

Originally posted by Blezius

Please re-upload the image as it's not showing for everyone.

I'm gonna fix the things. One moment.

Update: Job's done. 😎

Originally posted by desrai

week 2 being a solo and public server format will be interesting to watch

If you're able to tune in be sure to let us know what you think about it on our social channels. 😃👍

Originally posted by Tony_Danza_the_boss

Shoutout to Epic for doing these and keeping us in the loop. You guys rock

We're all about recognizing pain points and working to make improvements to those in the future. Keep doing a great job at leaving us feedback so we're aware of things we can work on. ❤️

Originally posted by shaf12

/u/EpicEricSW PLS FIX...

This should be fixed in 5.10.

Originally posted by RezQ_YT

Really interesting read. It's fascinating how this works, and how many people take the sub-20-second queue times for granted. I also have some questions I would love to hear answers to:

1- Will the Playground mode ever make a return in some form or other? Maybe with the possibility of adding listen-server capability for offline play? And if Playground does come back, will it have any practice-friendly features, such as aim training using NPCs?

2- What are your plans for expanding to the MENA region? We are currently expecting AWS to come to the Middle East, in Bahrain, in early 2019, and I'd love to hear your thoughts on the possibility of expanding to the MENA region.

3- Is it possible to share any details about the progress of the Android release of Fortnite? Yesterday brought a leak from the community based on the files in the latest client: an ini file contained some information about an Android engine. So, can we expect Fortnite on Android to be around the corner, or is it still too early to be hyped about it? What can we expect optimization-wise?

Thank you and Epic for making such a wonderful game. I'm really hoping this game gets bigger and better each day.

  1. Playground is coming back, we're just adding some really neat functionality to it before we roll it out again.

  2. Actively evaluating alongside other regional efforts, and we're certainly paying attention to the growth of server infrastructure in that region.

  3. Still in development, but we're making progress!

Originally posted by only1ammo

As a fellow IT pro I have to admit I love this documentation almost as much as I love the game! Keep it coming, gang. Is there a developer blog that we can review to get insight on some of your other technology challenges or victory royales?

Our two previous PMs are here:

https://www.epicgames.com/fortnite/en-US/news/postmortem-of-service-outage-4-12

https://www.epicgames.com/fortnite/en-US/news/postmortem-of-service-outage-at-3-4m-ccu

Originally posted by JonDum

That makes sense, thanks for the response.

PS you are pretty lucky to get to deal with fun problems like this!

You're not wrong, I love my job.

Getting a bunch of questions on private hosting of servers, so I figured I'd toss a separate comment in here.

Our game code relies on server-client interactions for many reasons, including but not limited to the sheer processing power required to keep track of 100 simultaneous players, mitigating networking differences between players, ensuring equality between game environments, game changes and fixes without forcing client patches, and anti-cheating efforts.

For these reasons and more, Fortnite is built to be a game that relies on dedicated servers. Re-writing the code to be entirely client-side would be an incredibly complicated effort, and would lose everything listed above.

In addition, consider the cross-platform nature of the game. Some PCs might have the power to host a server (even with only 4 people it's a non-trivial load when you consider crazy Playground activities), but what about consoles or mobile devices? As powerful as they are and continue to get every generation, they're not optimized to serve as game servers alongside running the game client.

For this reason, we'd have to develop two different games (server-hosted for consoles/mobile and some PCs, and peer-to-peer for other PCs) just to run Playgrounds, and that's just not a realistic option.

Oh, and we already fixed the matchmaking problem :)

Originally posted by StereoZ

Why does it have to constantly check with the server tho?

Seems to be the real question here, it's a private server hosted locally so I don't see the need for that to happen.

/u/HardkoreParkore hit on a few of the major issues. One more to consider is the cross-platform nature of this game. Some PC players may have the kind of hardware to host a server, but what about mobile players? We strive for a world-class console experience, but if we tried to host Playground servers on consoles as well, we would likely hurt performance.

Because of the cross-play nature of the mobile/console/PC ecosystem we'd either need all platforms to be able to host servers, or have both server-hosted and peer-hosted versions, which is just not sustainable.

Originally posted by JonDum

/u/JShredz I'm probably just not used to the scale that you guys are dealing with, but how large are these lists of available servers? So large that it's infeasible to fit all game servers in the region in memory? If you're just storing IP/DNS name and some metadata about status, # of players/matches, etc., I'd imagine it's at most a MB or so per entry. Even with 10,000+ game servers that'd be only ~10 GB of memory per MMS node. So why not scale a bit more vertically with increased memory per MMS node rather than try to get free servers from other nodes' lists?

Or, why not have a couple of t2.xlarge instances act as the Masters of the Server List™ that each MMS node uses as a queue to get a game server from as it works to coordinate players together?

Sorry if this is totally off base, nonetheless I really love your technical posts and learn a great deal from them.
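
For reference, the back-of-the-envelope estimate in the question works out as follows. This is only a sketch using the poster's own guessed figures (about 1 MB per entry, 10,000 servers); neither number is confirmed by Epic, and the function name is made up for illustration.

```python
# Figures taken from the question above. Both are the poster's guesses,
# not numbers confirmed by Epic.
MB_PER_ENTRY = 1
SERVER_COUNT = 10_000


def list_memory_gb(entries: int, mb_per_entry: float) -> float:
    """Memory for one node's server list in decimal gigabytes (1 GB = 1000 MB)."""
    return entries * mb_per_entry / 1000


print(list_memory_gb(SERVER_COUNT, MB_PER_ENTRY))  # 10.0
```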

The lists are REAL big.

The issue is less total exhaustion of a local list than exhaustion of a particular region within that list, which forces the node to go to another list to search for a server for the corresponding region. You can imagine that the longer each list gets, the more expensive it becomes when a local list isn't sufficient and a non-local search takes place. While we now believe that the local lists should theoretically not be exhausted, there are edge cases (e.g. Rocket Launch) where events can drive extremely atypical matchmaking behavior, and we want to make sure a local exhaustion doesn't back up the whole system.
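
To make the "one region runs dry while the list as a whole is not empty" case concrete, here is a toy sketch. It is not Epic's implementation: the `ServerList` and `Node` classes, the region names, and the server IDs are all hypothetical, and the non-local search is modeled as a simple scan of other lists.

```python
from collections import defaultdict, deque


class ServerList:
    """Hypothetical list of free servers, bucketed by region."""

    def __init__(self, servers):
        self.free = defaultdict(deque)
        for region, server_id in servers:
            self.free[region].append(server_id)

    def take(self, region):
        """Hand out one free server for the region, or None if that bucket is empty."""
        queue = self.free[region]
        return queue.popleft() if queue else None

    def total_free(self):
        return sum(len(queue) for queue in self.free.values())


class Node:
    """Hypothetical matchmaking node with a local list plus other lists it can fall back to."""

    def __init__(self, local, other_lists):
        self.local = local
        self.other_lists = other_lists
        self.non_local_searches = 0

    def assign(self, region):
        server = self.local.take(region)
        if server is not None:
            return server  # fast path: served from the local list
        # Slow path: the local bucket for this region is empty,
        # even if other regions still have free servers locally.
        self.non_local_searches += 1
        for other in self.other_lists:
            server = other.take(region)
            if server is not None:
                return server
        return None


local = ServerList([("EU", "eu-1"), ("NA", "na-1"), ("NA", "na-2")])
remote = ServerList([("EU", "eu-9"), ("NA", "na-9")])
node = Node(local, [remote])

first = node.assign("EU")   # served locally
second = node.assign("EU")  # EU bucket is empty, so a non-local search happens
```

After these two requests the local list still holds two free NA servers, yet the second EU request had to leave the node. That is the regional exhaustion described above, as opposed to the whole list running out.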

Originally posted by fantasydrama

Nice. Would really like one for the end game lag we saw in the comp scene once y’all figure it out.

Yup, that will be coming out in the next few days.

Originally posted by Slinky621

Didn't they say this the moment it released back on the 2nd? I know I saw almost the exact comment in a thread like this from an Epic employee

That was me! I provided what info/context I could as things were happening, this is just a fuller and more complete explanation written with the benefit of additional time and discussions with the rest of the team.

Originally posted by Revolving_DCON

Can we expect more technical blogs like this in the future for general updates, as opposed to only when there's a major outage?

I worked on the Xbox 360 and a lot has changed in the networking landscape since then. This was a fascinating read, thank you!

It's really great to gain more insight into how the standards have evolved!

That's the goal! We just think they're extra important when things go wrong, so diverting the necessary time/resources to get these prepped and written is more urgent to keep with the philosophy of transparency. In the future we'd love to write more about all the awesome things we're doing that aren't the result of live issues!

Originally posted by UNCTillDeath

Assume Santa is real. All the kids in the world send him letters asking for gifts that his elves read. The elves then build these gifts. Each elf has a workstation with enough resources to make a lot of toys. If they run out then they have to go to the main supply room to get more stuff. This system works well for the entire population.

All of a sudden the world kid population multiplies by 4 or more times. Each elf pretty much instantly runs out of resources and then goes to the main supply room for more. The guy in charge of this room is like, "da fuq?" Because he's literally getting bombarded by hundreds of elves who need things. This causes production to grind to a stop. Ruh-roh.

I don't have a good analogy for this part but the supply guy also has to do some logic to make sure each elf gets a diverse number of resources to make sure every kid (hopefully) gets a toy. This makes his brain hurt and production becomes even slower.

In this case, Santa is Epic, the elves are the MMS nodes, the resources are servers, the supply room is the server stockpile and we are the kids. The stopped production of toys is the matchmaking outage.

I have absolutely no idea the details and this is just a guess, but I think they may have restructured the way the list is maintained and built some separate service in charge of keeping the list populated. Essentially Santa hired another set of Elves who keep adding resources to everyone's pile to make sure that backlog doesn't happen again.

Super solid analogy, you got it!

Originally posted by OhhYesMommy

i have no idea what the fuck that all means, but thank you

Each "node" or piece of our matchmaking service (we have a lot of identical parts that make up the whole) has a big list of free servers it can match players up with. Normally, it "gives out" a server to a group of players and then can refresh its list to stay full from a big shared stockpile of free servers.

For the Playground launch, a huge wave of people burst in and took all of the servers on each list faster than the system could add more from the stockpile. When a node doesn't have the server a player asks for on its list, it takes a long time (computationally speaking) to go fetch it from the stockpile.

We made the process of maintaining the lists a lot easier, so the odds of a node ever running out of available servers on its list are now very low.

Edit: The above is about 80% accurate (full PM has more details), but hopefully gives you a better rough understanding.
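
If it helps, the "list drains faster than it refills" part can be shown with a tiny simulation. This is a rough sketch only: the function name and every number in it (list size, refill rate, demand per tick) are invented for illustration and say nothing about the real system's scale.

```python
def count_slow_fetches(list_size, refill_per_tick, demand_by_tick):
    """Count requests that miss the node's local list and need a slow stockpile fetch.

    Toy model: each tick, requests are served from the local list first,
    any remainder counts as a slow fetch, then the list is topped up by
    `refill_per_tick` servers (never above `list_size`).
    """
    free = list_size
    slow_fetches = 0
    for demand in demand_by_tick:
        served_locally = min(free, demand)
        free -= served_locally
        slow_fetches += demand - served_locally
        free = min(list_size, free + refill_per_tick)
    return slow_fetches


# Ordinary load: the refill keeps up, so nobody hits the slow path.
normal = count_slow_fetches(list_size=100, refill_per_tick=20, demand_by_tick=[15] * 10)

# Launch-style burst: demand outruns the refill and the list runs dry.
burst = count_slow_fetches(list_size=100, refill_per_tick=20, demand_by_tick=[80] * 5)

print(normal, burst)  # 0 220
```

In this toy model the burst leaves 220 requests waiting on the slow path, while ordinary load leaves none. The real fix is described in the full PM; the sketch only shows why an empty local list hurts.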

Originally posted by areyoudizzzy

Could you perhaps still load 100 players per game but just make any players and their builds/actions invisible to players outside their party and turn friendly fire back on? Wouldn't that have similar stress to the servers as standard matches?

Edit: other than each game lasting an hour instead of ~30mins

Not quite, since servers would have to calculate things differently for each group (invisible is one thing, but what you really mean is non-interacting entirely) it would be like functionally running distinct servers for each.

Originally posted by SebRev99

So tomorrow's maintenance is about this? Not the mode itself but the upgrade of the matchmaking service?

So.. (noob question) can we expect lower ping?

Yes to the first question, tomorrow's downtime will be to carry over what we learned and built for Playground to the main matchmaking fleet (along with a few other upgrades).

As to the second question, this will improve matchmaking but not likely in-match connections. We're always working to try to improve things as much as we can on our end, but a lot of it relies on geography and worldwide internet infrastructure.

Originally posted by Ice_Occultism

Our matchmaking is built on something called the Matchmaking Service (MMS), which is responsible for facilitating the “handshake” between players looking to join a match and an available dedicated server open to host that match. Each node in the matchmaking cluster keeps a large list of open dedicated servers that it can work with, randomly distributed by region to keep a roughly proportional amount of free servers for each. Players that connect to MMS request a server for their region, MMS assigns that player to a node, and the node picks a free server for the requested region from its list.

I am curious, is there any sort of preference for lower ping when being assigned to a server? Assuming, for example, that in the European region you have servers in various countries, would a player be placed preferentially on an open server closer to their location? It feels like sometimes I get around 40-50 ms ping on some EU servers but other times I get around 70 ms, and I was wondering if it could be down to where I am connecting to.

Most of that is just that true in-match ping fluctuates by a fair bit based on a whole range of factors unless you're right next door to the data center, but we're also working to add more levers and knobs to how we route players to improve the experience!

Originally posted by Vaniitio

F

I deserve this.

Originally posted by DoNotTakeMeSeriosly

No clever pun? Is everything good, Mr. Epic?

I shall try to redeem myself by providing some more context.

Here we go! We're going to be doing backend maintenance that will apply improvements to our matchmaking services and deploy updates to our session trackers. 😎

Mourn my lack of puns with F in chat.

Scheduled maintenance is planned for tomorrow, July 19. Downtime will begin at 4 AM ET (0800 GMT).